Asked  7 Months ago    Answers:  5   Viewed   21 times

I know 1000s of similar topics floating around. I read at lest 5 threads here in SO But why am I still not convinced about DVCS?

I have only following questions (note that I am selfishly worried only about Java projects)

  • What is the advantage or value of committing locally? What? really? All modern IDEs allows you to keep track of your changes? and if required you can restore a particular change. Also, they have a feature to label your changes/versions at IDE level!?
  • what if I crash my hard drive? where did my local repository go? (so how is it cool compared to checking in to a central repo?)
  • Working offline or in an air plane. What is the big deal?In order for me to build a release with my changes, I must eventually connect to the central repository. Till then it does not matter how I track my changes locally.
  • Ok Linus Torvalds gives his life to Git and hates everything else. Is that enough to blindly sing praises? Linus lives in a different world compared to offshore developers in my mid-sized project?

Pitch me!

 Answers

17

Reliability

If your harddisk silently starts corrupting data, you damn well want to know about it. Git takes SHA1 hashes of everything you commit. You have 1 central repo with SVN and if its bits get silently modified by a faulty HDD controller you won't know about it till it's too late.

And since you have 1 central repo, you just blew your only lifeline.

With git, everyone has an identical repo, complete with change history, and its content can be fully trusted due to SHA1's of its complete image. So if you back up your 20 byte SHA1 of your HEAD you can be certain that when you clone from some untrusted mirror, you have the exact same repo you lost!

Branching (and namespace pollution)

When you use a centralised repo, all the branches are there for the world to see. You can't make private branches. You have to make some branch that doesn't already collide with some other global name.

"test123 -- damn, there's already a test123. Lets try test124."

And everyone has to see all these branches with stupid names. You have to succumb to company policy that might go along the lines of "don't make branches unless you really need to", which prevents a lot of freedoms you get with git.

Same with committing. When you commit, you better be really sure your code works. Otherwise you break the build. No intermediate commits. 'Cause they all go to the central repo.

With git you have none of this nonsense. Branch and commit locally all you want. When you're ready to expose your changes to the rest of the world, you ask them to pull from you, or you push it to some "main" git repo.

Performance

Since your repo is local, all the VCS operations are fast and don't require round trips and transfer from the central server! git log doesn't have to go over the network to find a change history. SVN does. Same with all other commands, since all the important stuff is stored in one location!

Watch Linus' talk for these and other benefits over SVN.

Tuesday, June 1, 2021
 
van_folmert
answered 7 Months ago
83

Just a quick comment to remind you that:

  • those migrations often offer the opportunity to reorganize the sources, not along modules (each with one repositories) but rather along a functional domain split (several modules for a same given functional domain being put in the same repository).

Then submodules are to be used, as a way to define a configuration.

  • Git is alright, but from Linus's admission himself, to put everything into one repository can be problematic.

[...] CVS, ie it really ends up being pretty much oriented to a "one file at a time" model.

Which is nice in that you can have a million files, and then only check out a few of them - you'll never even see the impact of the other 999,995 files.

Git fundamentally never really looks at less than the whole repo. Even if you limit things a bit (ie check out just a portion, or have the history go back just a bit), git ends up still always caring about the whole thing, and carrying the knowledge around.

So git scales really badly if you force it to look at everything as one huge repository. I don't think that part is really fixable, although we can probably improve on it.

And yes, then there's the "big file" issues. I really don't know what to do about huge files. We suck at them, I know.


Those two aforementioned points advocate for a more component-oriented approach for large system (and large legacy repository).

With Git submodule, you can checkout them in your project (even if it is a two-steps process). You have however tools than can make the submodule management easier (git.rake for instance).


When I'm thinking of fixing a bug in a module that's shared between several projects, I just fix the bug and commit it and all just do their updates

That is what I describe in the post Vendor Branch as the "system approach": everyone works on the latest (HEAD) of everything, and it is effective for small number of projects.
For a large number of modules though, the notion of "module" is still very useful, but its management is not the same with DVCS:

  • for closely related modules (aka "in the same functional domain", like "all modules related to PNL - Profit aNd Losses - or "Risk analysis", in a financial domain), you do need to work with the latest (HEAD) of all components involved.
    That would be achieved with the use of a subtree strategy, not in order for you to publish (push) corrections on those other submodules, but to track works done by other teams.
    Git allows that with the extra-bonus that this "tracking" does not have to take place between your repository and one "central" repository, but can also take place between you and the local repository of the other team, allowing for a very quick back-and-forth integration and testing between projects of similar nature.

  • however, for modules which are not directly in your functional domain, submodules are a better option, because they refer to a fix version of a module (a commit):
    when a low-level framework changes, you do not want it to be propagated instantaneously, since it would impact all the other teams, which would then have to drop what they were doing to adapt their code to that new version (you do want though all the other teams to be aware of this new version, in order for them to not forget to update that low-level component or "module").
    That allows you to work only with official stable identified versions of other modules, and not potentially un-stabled or not fully tested HEADs.

Wednesday, June 9, 2021
 
Evernoob
answered 6 Months ago
58

As of version 2.0, it is not possible to make a so-called "narrow clone" with Mercurial, that is, a clone where you only retrieve data for a specific sub-directory. We call it a "shallow clone" when you only retrieve part of the history, say, the last 100 revisions.

As you say, there is nothing in the common DAG-based history model that excludes this feature and we have been working on it. Peter Arrenbrecht, a Mercurial contributor, has implemented two different approaches for narrow clones, but neither approach has been merged yet.

Btw, you can of course split an existing Mercurial repository into pieces where each smaller repository only has the history for a specific sub-directory of the original repository. The convert extension is the tool for this. Each of the smaller repositories will be unrelated to the bigger repository, though — the tricky part is to make the splitting seamless so that the changesets keep their identities.

Thursday, July 1, 2021
 
Packy
answered 6 Months ago
96

I know there are quite a few git-based wikis such as git-wiki, WiGit and gitit. A simple google search will bring up many others, I'm sure.

I also know of some git-based bug trackers such as ticgit which basically lets you keep your tickets in a separate branch of a git repository. There's also DisTract.

But I'm not aware of anything else aside from Fossil that really tries to do what it does in one combined tool.

I'm curious what your experiences with Fossil have been. It's one of those things I find interesting but not been in a position to actually use as yet.

Friday, August 13, 2021
 
motanelu
answered 4 Months ago
98

While git offers a CVS server, this server is very limited. You can't create tags or branches, and the git branches appears as CVS modules. Also you can't convert your CVS repo into git in such a way that the file revision numbers are equal afterwards (git cvsserver creates an own database of file revisions for each git branch at the time when they are needed).

OTOH you can use git as a cvs frontend. The workflow is that you use git cvsimport to pull the history from the CVS server, and use git cvsexportcommit to export some git commits into a local CVS checkout. There is an article in Tsuna's blog with more details about this.

Another way to tackle this problem is to analyze why your coworkers don't want to switch. Here it was that they simply did not know/care about new ways of VCS, and where into a we-always-did-it-this-way habit. Having used mercurial in one pilot project was the key to convince the other team members here.

Friday, August 20, 2021
 
Kevin
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share