Helping libgit2 to grow up
You might have heard of the libgit2 library, a native library that provides a programmatic interface to most of Git's functionality.
Products like GitHub and even the newest Microsoft TFS integration with Git make extensive use of the library.
It turns out that here at Plastic SCM we use libgit2 too, in order to build our bidirectional synchronization with Git!
So today I wanted to highlight the contributions we made through our partner Elego Software in Germany (and our good friend Carlos Martín Nieto). I think it is worth noting that most of these contributions might already be helping other Git-based products to shine, and since we put them through really tough stress tests, they should perform great!
- Some missing introspection code was added to git_odb_foreach() and git_packfile_foreach(), since we need to loop through the packs created remotely by Git in order to walk them and push them into Plastic
- We asked Carlos to fix some leaks in the indexer code and make it more resource-friendly
- Hashing of packfiles with large objects has been sped up greatly by keeping the state of partially-downloaded objects instead of retrying from the start. The old behavior was killing the walking of packs with big files, although we’re not sure this specific change will help other users of the library
- Introduced a delta-based object cache to avoid decompressing the same data over and over again. Together with the previous change, this made the fetch as fast as git’s in a single thread.
- Fixed the assumption that an object would always fit in one memory window (32-bit systems would crash with objects >16 MB)
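
To give a feel for the "keep the state of partially-downloaded objects" idea, here is a minimal Python sketch. It is not libgit2's code (libgit2 is written in C, and its indexer also has to deal with compressed and deltified data, which this ignores). The class `ResumableObjectHasher` and the helper `hash_from_scratch` are hypothetical names made up for this illustration. The only fact it relies on is that a Git object id is the SHA-1 of the header `<type> <size>\0` followed by the object contents.

```python
import hashlib


class ResumableObjectHasher:
    """Hypothetical helper: hashes a Git-style object as its bytes arrive.

    The running SHA-1 state is kept between calls, so when only part of a
    large object has been received, the next chunk continues the hash
    instead of starting over from byte zero.
    """

    def __init__(self, obj_type: str, size: int) -> None:
        self.expected_size = size
        self.received = 0
        # Git hashes "<type> <size>\0" followed by the object contents.
        self._sha1 = hashlib.sha1(f"{obj_type} {size}\0".encode("ascii"))

    def feed(self, chunk: bytes) -> None:
        """Add the next chunk of (already uncompressed) object data."""
        if self.received + len(chunk) > self.expected_size:
            raise ValueError("more data than the declared object size")
        self._sha1.update(chunk)
        self.received += len(chunk)

    @property
    def complete(self) -> bool:
        return self.received == self.expected_size

    def hexdigest(self) -> str:
        if not self.complete:
            raise ValueError("object is still incomplete")
        return self._sha1.hexdigest()


def hash_from_scratch(obj_type: str, data: bytes) -> str:
    """The naive alternative: rehash everything received so far."""
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    data = b"hello\n"
    hasher = ResumableObjectHasher("blob", len(data))
    hasher.feed(data[:3])   # first network read delivers only part of it
    hasher.feed(data[3:])   # the rest arrives later; no rehash of the start
    print(hasher.hexdigest() == hash_from_scratch("blob", data))
```

With the naive approach, every retry on a big object pays for all the bytes again; keeping the hash state means each byte is hashed once, which is where the gain on packs with big files comes from.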
"We asked Carlos to fix some leaks in the indexer code and make it more resource-friendly" - so strong
@Andrius Bentkus... ???? Not sure I follow :O