Asked  7 Months ago    Answers:  5   Viewed   37 times

Can you provide a list of all (or the most common) operations or commands that can compromise the history in Git?

What should be absolutely avoided?

  1. Amending a commit after it has been pushed (git commit / git push / git commit --amend)
  2. Rebasing commits that have already been pushed

I would like this question (if it has not already been asked somewhere else) to become some kind of reference on the common avoidable operations in Git.

Moreover, I use git reset a lot, but am not completely aware of the possible damage I could do to the repository (or to the other contributors' copies). Can git reset be dangerous?

 Answers

20

knittl has already compiled a good list of the commands that rewrite history, but I wanted to build upon his answer.

Can you provide a list of [...] the operations or commands that can compromise the history in git? What should be absolutely avoided?

First of all, there is nothing wrong with rewriting/deleting history per se; after all, you probably routinely create feature branches, keep them strictly local, then delete (after merging them or realising they lead you nowhere) without thinking twice about it.

However, you can and certainly will run into problems when you locally rewrite/delete history that other people already have access to and then push it to a shared remote.
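
As a rough illustration of that scenario, the sketch below amends a commit after it has been pushed and then tries a plain push. Everything here is a throwaway setup invented for the example: the temporary directory, the bare repository shared.git standing in for the shared remote, and the file name.

```shell
set -eu
work=$(mktemp -d)                       # throwaway directory for the demo
git init -q --bare "$work/shared.git"   # stands in for the shared remote

git init -q "$work/clone"
cd "$work/clone"
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/shared.git"

echo one > file.txt
git add file.txt
git commit -q -m "first"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"            # others can now fetch this commit

git commit -q --amend -m "first, reworded"   # rewrites the pushed commit

# A plain push is refused because the remote tip is not an ancestor
# of the amended commit; only a forced push would overwrite it.
if git push -q origin "$branch" 2>/dev/null; then
    plain_push="accepted"
else
    plain_push="rejected"
fi
echo "plain push: $plain_push"
```

The refusal is Git's way of telling you that the local branch no longer contains what the remote has; forcing the push at that point is what causes trouble for everyone who already fetched the old commit.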

Operations that should count as rewriting/deleting the history of a local repo

Of course, there are dumb ways of corrupting or deleting history (e.g. tampering with the contents of .git/objects/), but those are outside the scope of my answer.

You can rewrite the history of a local repo in various ways. The section of the Pro Git book entitled Rewriting History mentions a few:

  • git commit --amend
  • git rebase
  • git filter-branch
  • Roberto Tyley's BFG Repo Cleaner (a 3rd-party tool)

Arguably, there are more. Any operation that has the potential to alter or otherwise move a non-symbolic reference (branch or tag) and make it point to a commit that is not a descendant of the branch's current tip should count as rewriting local history. This includes:

  • git commit --amend: replaces the last commit;
  • All forms of rebase (incl. git pull --rebase);
  • git reset (see an example below);
  • git checkout -B and git branch -f: resets an existing branch to a different commit;
  • git tag --force: recreates a tag with the same name but potentially pointing to another commit.
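
The "not a descendant of the branch's current tip" criterion can be checked mechanically. Here is a minimal sketch, assuming a throwaway repository and made-up file contents, that uses git merge-base --is-ancestor to compare a branch tip before and after git commit --amend:

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"

old_tip=$(git rev-parse HEAD)
git commit -q --amend -m "second (amended)"
new_tip=$(git rev-parse HEAD)

# The move is a harmless fast-forward only if the old tip is an
# ancestor of the new tip; otherwise local history was rewritten.
if git merge-base --is-ancestor "$old_tip" "$new_tip"; then
    verdict="fast-forward"
else
    verdict="rewritten"
fi
echo "$verdict"
```

The same test works for any of the commands in the list above: record the tip, run the command, and compare.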

Any deletion of a non-symbolic reference (branch or tag) may also be considered history deleting:

  • git branch -d or git branch -D
  • git tag -d

Arguably, deleting a branch that has been fully merged into another should be considered only a mild form of history deleting, if at all.

Tags are different, though. Deleting a lightweight tag is not such a big deal, but deleting an annotated tag, which is a bona fide Git object, should count as deleting local history.
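
To make the difference concrete, the following sketch (throwaway repository, hypothetical tag names light and annotated) asks Git what kind of object each tag name resolves to:

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"

git tag light                             # lightweight: just a ref to the commit
git tag -a annotated -m "release notes"   # annotated: a separate tag object

light_type=$(git cat-file -t light)
annotated_type=$(git cat-file -t annotated)
echo "light -> $light_type, annotated -> $annotated_type"
```

The lightweight tag points straight at a commit, whereas the annotated tag points at a tag object carrying its own message and tagger; deleting the latter makes that object unreachable.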

Operations that rewrite/delete the history of a remote repo

As far as I know, only a git push -f (equivalent to git push --force) has the potential to rewrite/delete history in the remote repository.

That said, it is possible to

  • disable the ability to force-update remote branches to non-fast-forward references, by setting receive.denyNonFastForwards on the server.
  • disable the ability to delete a branch living on a remote repository, by setting receive.denyDeletes on the server.
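
A small sketch of the first setting, using a local bare repository as a stand-in for the server (the paths and names are made up for the example):

```shell
set -eu
work=$(mktemp -d)                        # throwaway directory
git init -q --bare "$work/server.git"    # plays the role of the server
git -C "$work/server.git" config receive.denyNonFastForwards true

git init -q "$work/client"
cd "$work/client"
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/server.git"

echo one > file.txt
git add file.txt
git commit -q -m "first"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"

git commit -q --amend -m "first (amended)"   # local rewrite

# With receive.denyNonFastForwards set, even a forced push is refused.
if git push -q --force origin "$branch" 2>/dev/null; then
    forced_push="accepted"
else
    forced_push="rejected"
fi
echo "forced push: $forced_push"
```

receive.denyDeletes is set the same way, with git config on the server-side repository.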

Moreover, I use git reset a lot, but am not completely aware of the possible damage I could do to the repository (or to the other contributors' copies). Can git reset be dangerous?

git-reset, as mentioned by knittl, usually changes where a branch reference points. This command can be dangerous, in so far as it can make reachable commits become unreachable. Because a picture speaks a thousand words, consider the following situation:

[Figure: a linear history in which master points at commit D, whose parent is C, whose parent is B]

You're on the master branch, which points at commit D. Now, let's say you run, for instance,

git reset --soft master~2

A soft reset is considered to be the most benign form of reset, because it "only" changes where the current branch points to, but doesn't affect the staging area or your working tree. That said, merely changing where a branch points to in that fashion has ramifications: after that soft reset, you will end up with

[Figure: the same history after the reset; master now points at commit B, and commits C and D are no longer reachable from it]

Commits C and D, which were reachable from master before the reset, have now become unreachable; in other words, they're not ancestors of any reference (branch, tag, or HEAD). You could say that they're in "repository limbo"; they still exist in your Git repo's object database, but they will no longer be listed in the output of git log.

If you actually found those commits valuable before the reset, you should make them reachable again by making some reference (e.g. another branch) point to commit D again. Otherwise, commits C and D will end up dying a true death when Git runs its automatic garbage collection and deletes unreachable objects.

You can, in theory, fish commit D out of the reflog, but there is always a risk that you will forget about those unreachable commits or won't be able to identify which entry of the reflog corresponds to commit D.
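
For the simple case where the reset was the very last thing you did, the reflog lookup can be sketched as follows. The repository, the commit names A to D, and the branch name rescue are all invented for the example:

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for name in A B C D; do           # linear history A <- B <- C <- D
    echo "$name" >> file.txt
    git add file.txt
    git commit -q -m "$name"
done

old_tip=$(git rev-parse HEAD)     # commit D
git reset -q --soft HEAD~2        # branch now points at B

# HEAD@{1} is the reflog entry for where HEAD was before the reset.
recovered=$(git rev-parse 'HEAD@{1}')
git branch rescue "$recovered"    # make C and D reachable again
git log --format=%s rescue
```

After more operations the entry you want is no longer HEAD@{1}, and you have to read through git reflog to find it, which is exactly the risk described above.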

In conclusion, yes, git-reset can be dangerous, and it's a good idea to make sure the current tip of the branch you're about to reset will remain reachable after the reset. If needed, create another branch there before the reset, just in case, as a backup; and if you're sure you want to forget those commits, you can always delete that branch later.
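
The backup-branch precaution looks roughly like this; again a sketch in a throwaway repository, with backup as an arbitrary branch name:

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for name in A B C D; do           # linear history A <- B <- C <- D
    echo "$name" >> file.txt
    git add file.txt
    git commit -q -m "$name"
done

git branch backup                 # keeps the current tip (D) reachable
git reset -q --soft HEAD~2        # current branch now points at B

on_branch=$(git rev-list --count HEAD)     # commits reachable from the branch
on_backup=$(git rev-list --count backup)   # commits reachable from the backup
echo "branch: $on_branch commits, backup: $on_backup commits"

# Once you are sure C and D are no longer needed:
# git branch -D backup
```

Creating the branch costs nothing, since it is just one more reference to a commit that already exists.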

Tuesday, June 1, 2021
 
Naveen
answered 7 Months ago
72

You can use git submodules to "link" to other projects. See here - http://help.github.com/submodules/
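
A minimal local sketch of what that looks like, assuming two throwaway repositories named lib and app. The protocol.file.allow override is only there because the example clones from a local path, which recent Git versions restrict for submodules; with a normal https or ssh URL it is not needed.

```shell
set -eu
work=$(mktemp -d)                 # throwaway directory

# A stand-in for the "other project".
git init -q "$work/lib"
cd "$work/lib"
git config user.name "Example"
git config user.email "example@example.com"
echo lib > lib.txt
git add lib.txt
git commit -q -m "lib: first commit"

# The project that links to it.
git init -q "$work/app"
cd "$work/app"
git config user.name "Example"
git config user.email "example@example.com"
echo app > app.txt
git add app.txt
git commit -q -m "app: first commit"

git -c protocol.file.allow=always submodule add "$work/lib" lib
git commit -q -m "Add lib as a submodule"

# The submodule is recorded as a gitlink entry (mode 160000).
mode=$(git ls-files -s lib | cut -d' ' -f1)
echo "lib is recorded with mode $mode"
```

The URL and path of the submodule end up in the .gitmodules file, which is committed alongside the gitlink.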

Tuesday, July 27, 2021
 
PHLAK
answered 5 Months ago
64

Traderhunt Games traced this to some antivirus software, which makes sense. The reason has to do with the process Git uses to update a configuration entry.

When git config runs and is told to change one or more configuration key = value field(s), such as changing core.filemode to false, the way it implements this is to use a three-step process:

  1. Create a new, empty file (.git/config.lock), using the OS service call that creates a file, or fails if the file already exists. If this step fails, that indicates that another git config (or equivalent) command is already running and we must wait for it to finish before we do our own git config.

  2. Read the existing configuration file, one key = value entry at a time. If the key is the one that we care about, write the new key = value entry; otherwise, copy the existing key = value entry.

    There's some fanciness here with keys that are allowed to repeat, vs keys that should only occur once; see the --replace-all and --unset-all options to git config for details. Note that git config itself knows little to nothing about most key and value pairs, and you can invent your own key/value pairs as long as you pick keys that Git is not using today and won't be using in the future. (How you figure out what Git will and won't use in, say, the year 2043, I have no idea. :-) ) The main exceptions are some of the core.* values, which git config does understand, and several other Git commands may set on their own.

    (Note that --unset is handled much the same as replacing. Like a non-all replace, it is meant for a key that occurs once; if the key matches several lines, git config refuses and leaves the file unchanged, and you need --unset-all (or --replace-all) instead. Unsetting is implemented by simply not writing the given key, instead of writing a replacement key = value. Since git config is simply working through the file line-by-line, that's easy to do. Also, if your key = value is totally new, Git handles this by reading through all the lines, noticing that it did not replace any existing key, and hence adding a new key = value line. This is complicated a bit by the fact that the keys are listed section-by-section, but the logic itself is simple enough.)

  3. Finally, having read through the entire existing configuration and completely written out the new one (using fflush and fsync and fclose and so on as needed), git config invokes the OS service to rename a file, in order to rename .git/config.lock to .git/config. This is where the process is failing in this particular case.
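
The handling of repeated keys mentioned in step 2 is easy to try out in a throwaway repository. In this sketch, example.path is a made-up key that Git itself does not use:

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q

# Store the same (made-up) key twice.
git config --add example.path one
git config --add example.path two
count_two=$(git config --get-all example.path | wc -l | tr -d ' ')

# Replace every occurrence with a single value.
git config --replace-all example.path three
after_replace=$(git config --get-all example.path)

# Remove every occurrence.
git config --unset-all example.path
if git config --get example.path >/dev/null; then
    after_unset="still set"
else
    after_unset="unset"
fi
echo "$count_two values, then '$after_replace', then $after_unset"
```

Each of these invocations goes through the same lock, rewrite, and rename sequence described in the three steps.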

The rename, if it succeeds, has the effect of putting the new configuration into effect and removing the lock file, all as one atomic operation: any other Git command sees either the complete old configuration, from the original .git/config file, or the complete new configuration, from the new .git/config file that was known during construction as .git/config.lock.
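
The general lock-then-rename pattern can be sketched in a few lines of shell. This is not Git's code, only an imitation of the three steps on a made-up, simplified config file; it relies on the shell's noclobber option for the exclusive create and on mv within one directory being a rename.

```shell
set -eu
dir=$(mktemp -d)                  # throwaway directory
config="$dir/config"
lock="$config.lock"
printf 'filemode = true\nbare = false\n' > "$config"

# Step 1: create the lock file, failing if it already exists.
if ! ( set -o noclobber; : > "$lock" ) 2>/dev/null; then
    echo "another writer holds $lock" >&2
    exit 1
fi

# Step 2: copy the old file line by line, replacing the key we care about.
while IFS= read -r line; do
    case "$line" in
        "filemode = "*) echo "filemode = false" ;;
        *)              echo "$line" ;;
    esac
done < "$config" > "$lock"

# Step 3: rename the lock file over the real file. Readers see either
# the complete old file or the complete new one, never a partial write.
mv "$lock" "$config"
cat "$config"
```

If something else is holding the target file open in a way that forbids replacing it, step 3 is the one that fails, which matches the symptom described here.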

Another StackOverflow question asks: Will we ever be able to delete an open file in Windows? The accepted answer includes this statement: An anti virus product that does not open files with full sharing (including deletion) enabled is buggy. If that's the case here (that is, if this particular AV software fails to open files with the "allow delete" flag), then this AV software is buggy and is the cause of the problem.

Tuesday, August 3, 2021
 
MannfromReno
answered 4 Months ago
87

If your project was itself a git repo (meaning it has a .git), the presence of the .git could confuse the GitHub Desktop client.
It could be seen as a nested git repo (for which only the gitlink is recorded), and the GUI tries to see it as a submodule.
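
You can see the gitlink behaviour from the command line. The sketch below uses a throwaway outer repository containing a second repository in a directory called nested (both names are invented for the example):

```shell
set -eu
outer=$(mktemp -d)                # throwaway outer repository
cd "$outer"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# A second repository living inside the first one.
git init -q nested
(
    cd nested
    git config user.name "Example"
    git config user.email "example@example.com"
    echo x > file.txt
    git add file.txt
    git commit -q -m "nested: first commit"
)

# Git warns about the embedded repository and records only a gitlink.
git add nested 2>/dev/null
mode=$(git ls-files -s nested | cut -d' ' -f1)
echo "nested is recorded with mode $mode"
```

Mode 160000 marks a gitlink: the outer repository stores the nested repository's commit ID, not its files.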

Try with command line (unzip PortableGit-2.6.3-64-bit.7z.exe anywhere you want and add it to your %PATH%)

cd /your/project
git remote add origin https://github.com/<username>/<yourrepo>
git push -u origin HEAD

Then reference that same project directory in GitHub Desktop: it will be recognized as a local repo linked to a GitHub one.


Update 2018, as mentioned by t3chb0t in the comments:

The new client desktop.github.com doesn't have this problem anymore.

Wednesday, October 13, 2021
 
adrianbanks
answered 2 Months ago
49

You can use filter-branch to do this. First, write a small script that rewrites the tree for a given commit. For example, the following script replaces "something" with "something else" in README.md, but only if that file exists.

if [ -f README.md ]; then
    sed 's/something/something else/g' README.md > tmp
    mv tmp README.md
fi    

Save this as change.sh and then run the following

git filter-branch --tree-filter "/bin/bash $(pwd)/change.sh" HEAD

This will rewrite all the commits reachable from HEAD. git filter-branch saves the original refs under refs/original/, so if you've made a mistake, you can go back to the earlier history using git reset and try again.
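
Here is a sketch of a full round trip in a throwaway repository: rewrite with a tree filter, then return to the pre-rewrite history via the backup ref that filter-branch leaves under refs/original/. The repository contents are made up, the filter is inlined instead of living in change.sh, and FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer Git versions print.

```shell
set -eu
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "something" > README.md
git add README.md
git commit -q -m "first"
echo "more something" >> README.md
git commit -q -am "second"

before=$(git rev-parse HEAD)
branch=$(git rev-parse --abbrev-ref HEAD)

export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --tree-filter '
    if [ -f README.md ]; then
        sed "s/something/something else/g" README.md > tmp && mv tmp README.md
    fi
' HEAD >/dev/null 2>&1
after=$(git rev-parse HEAD)

# Undo: point the branch back at the saved original tip.
git reset -q --hard "refs/original/refs/heads/$branch"
restored=$(git rev-parse HEAD)
echo "rewritten: $after, restored: $restored"
```

Because refs/original/ still exists after the reset, a second attempt needs git filter-branch -f (or the backup ref has to be deleted first).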

Saturday, November 20, 2021
 
laura
answered 2 Weeks ago