git submodule or git subtree?

September 26, 2017

git submodule and git subtree are two very different beasts, but they are designed to solve similar problems. The internet is filled with pages urging people to stay away from git submodule, but I think it’s just a question of understanding the purpose and limitations of each tool, and using the right tool for the job.

Similarities

  • Both git submodule and git subtree are designed to incorporate external git repositories into other git repositories.
  • Both solutions provide a means to tie a certain version of an external component to the local repository, and bundle them together.
  • Both solutions preserve the history of the external repository, allowing you to check out older commits of the external repository.

Dissimilarities

  • git submodule stores references to the external repositories, while git subtree merges the external repo contents into your local repository.
  • git submodule allows you to perform git operations on the submodule, such as checking out a different commit, without affecting the other files in the working directory. With a subtree, the external files are part of the parent repository, so checking out a different commit affects all files in the working directory.
  • git submodule allows you to check out different branches, make commits, push to upstream, and still return the local submodule to the last recorded state by just executing git submodule update, without having to perform git resets.
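The first and last points above can be tried out in a throwaway sandbox. This is only a minimal sketch: the repository names (lib, parent), the vendor/lib path, and the demo identity are all made up for illustration, and the protocol.file.allow override is there only because recent git versions refuse local-path submodule URLs by default.

```shell
set -e
# Throwaway sandbox; every name and path below is hypothetical.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in for the external repository, with two commits.
git init -q lib
echo one > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib: first"
echo two > lib/file.txt
git -C lib commit -q -am "lib: second"

# The parent repository records lib as a submodule.
git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "add vendor/lib as a submodule"

# The parent's tree holds a gitlink entry (mode 160000) naming one commit,
# not the submodule's files.
git ls-tree HEAD vendor/lib
recorded=$(git rev-parse HEAD:vendor/lib)

# Move the submodule to an older commit; the parent's own files are untouched.
git -C vendor/lib checkout -q HEAD~1
moved=$(git -C vendor/lib rev-parse HEAD)

# Return the submodule to the commit recorded by the parent, no reset needed.
git submodule --quiet update
restored=$(git -C vendor/lib rev-parse HEAD)
echo "recorded=$recorded"
echo "restored=$restored"
```

The two printed hashes should match, since git submodule update checks out whatever commit the parent has recorded.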

git subtree pros and cons

Pros

  1. The external component is merged into the local repository, so nothing extra needs to be done when cloning.
  2. Even if the external repository is no longer available, the merged content in the local repository lives on.

Cons

  1. Contributing changes back to the upstream of the external component is harder.
  2. Updates are more invasive: even simply bringing in the latest version of the external repo creates a merge commit, and reverting to the earlier state involves a git reset.
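To see what the second point looks like in practice, here is a rough sketch of adding and then updating a subtree. The repository names, the vendor/lib prefix, and the commit message are invented for the example. git subtree ships in git's contrib area and is not installed everywhere, so the script checks for it first and skips if it is missing.

```shell
set -e
# Throwaway sandbox; every name and path below is hypothetical.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# git subtree is a contrib command; skip the demo if it is not installed.
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
    echo "git subtree is not installed; skipping" >&2
    exit 0
fi

# A stand-in for the external repository.
git init -q lib
echo one > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib: first"
branch=$(git -C lib symbolic-ref --short HEAD)

# The parent needs at least one commit before a subtree can be added.
git init -q parent
cd parent
git commit -q --allow-empty -m "parent: first"

# Merge the external repo's content and history under vendor/lib.
git subtree add --prefix=vendor/lib "$work/lib" "$branch"
before=$(git rev-parse HEAD)

# Upstream moves on...
echo two > "$work/lib/file.txt"
git -C "$work/lib" commit -q -am "lib: second"

# ...and picking up the change creates a merge commit in the parent.
git subtree pull --prefix=vendor/lib -m "pull lib update" "$work/lib" "$branch"
after=$(git rev-parse HEAD)
pulled=$(cat vendor/lib/file.txt)
git log --oneline --merges -n 1

# Going back to the earlier state means resetting the parent itself.
git reset -q --hard "$before"
reverted=$(cat vendor/lib/file.txt)
```

Note that the reset rewinds the parent repository's own branch, which is what makes this more invasive than moving a submodule around.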

git submodule pros and cons

Pros

  1. The external repository functions exactly like a normal git repository; you can freely check out other commits and make changes locally, without affecting the rest of the working directory.
  2. Easier to contribute to the submodule’s upstream.

Cons

  1. A plain git clone does not fetch submodules; they have to be fetched in a separate step (git submodule update --init) or requested explicitly with git clone --recurse-submodules.
  2. If the upstream repository moves elsewhere or is unreachable, the submodule directories will be empty after cloning.
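The cloning behaviour is easy to check locally. The sketch below uses made-up repository names (lib, parent, plain, full) and a hypothetical vendor/lib path; the protocol.file.allow override is only needed because the submodule URL here is a local path, which recent git versions reject by default.

```shell
set -e
# Throwaway sandbox; every name and path below is hypothetical.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# External repository plus a parent that uses it as a submodule.
git init -q lib
echo one > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib: first"
git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git -C parent commit -q -m "add vendor/lib as a submodule"

# A plain clone creates the submodule directory but leaves it empty.
git clone -q parent plain
empty_before=$(ls -A plain/vendor/lib)

# Fetching the submodule is a separate step...
git -C plain -c protocol.file.allow=always submodule --quiet update --init
ls plain/vendor/lib

# ...unless the clone is asked to do it in one go.
git -c protocol.file.allow=always clone -q --recurse-submodules parent full
ls full/vendor/lib
```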

If using git submodule, pay attention to this!

  1. The parent git directory is the authority on the version of the submodule that is checked out.
  2. If you have made changes (and commits) in the submodule, remember to also commit in the parent git directory if you want the parent to record the submodule’s latest commit. Otherwise, a git submodule update from the parent directory will take the submodule back to the last recorded commit, not to where it currently is.
  3. By default, git submodule update leaves the submodule in a detached HEAD state. If you want to make changes in the submodule, first check out the branch you want to work on; otherwise you could end up with commits that are not on any branch.
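The three points above can be walked through in a sandbox. As before, this is a sketch under assumptions: the repository names (lib, parent, checkout), the vendor/lib path, and the commit messages are all invented, and protocol.file.allow is overridden only because the submodule URL is a local path.

```shell
set -e
# Throwaway sandbox; every name and path below is hypothetical.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# External repository, a parent that uses it, and a fresh clone to work in.
git init -q lib
echo one > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib: first"
branch=$(git -C lib symbolic-ref --short HEAD)
git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git -C parent commit -q -m "add vendor/lib as a submodule"
git -c protocol.file.allow=always clone -q --recurse-submodules parent checkout
cd checkout

# Point 3: a freshly updated submodule is on a detached HEAD.
# rev-parse --abbrev-ref prints the literal string HEAD in that case.
detached=$(git -C vendor/lib rev-parse --abbrev-ref HEAD)

# So check out a branch before committing in the submodule.
git -C vendor/lib checkout -q "$branch"
echo two > vendor/lib/file.txt
git -C vendor/lib commit -q -am "lib: second"
new=$(git -C vendor/lib rev-parse HEAD)

# Point 2: the parent has not recorded the new commit yet, so an update
# takes the submodule back to the last recorded commit.
git submodule --quiet update
rewound=$(git -C vendor/lib rev-parse HEAD)

# Return to the branch and record its tip in the parent.
git -C vendor/lib checkout -q "$branch"
git add vendor/lib
git commit -q -m "bump vendor/lib"
recorded=$(git rev-parse HEAD:vendor/lib)
echo "new=$new"
echo "recorded=$recorded"
```

Because the new commit was made on a branch, it is still reachable after the update rewinds the submodule; had it been made on the detached HEAD, it would no longer be on any branch.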