Improving Darcs' network performance
by exlevan (https://www.blogger.com/profile/15653410551704557324)

GSoC 2010 Progress Report #3 (2010-08-08)

Last week I was developing a <a href="https://bugs.darcs.net/issue1773">smart server</a> for Darcs. The main challenge in designing the server is that the current code for working with the repository is rather low-level and operates on a per-file basis, so making it work with a smart server is a non-trivial task.<br />
<br />
To solve this problem, I implemented a <a href="https://bugs.darcs.net/issue1483">common interface</a> for working with both smart and dumb protocols, called RepoIO. RepoIO makes it easy to add new protocols to Darcs in the future, and also provides a convenient high-level API for users of the Darcs library.<br />
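To illustrate the idea, here is a rough sketch of what a protocol-agnostic interface like RepoIO could look like. This is in Python rather than Darcs' Haskell, and all names and method signatures below are invented for illustration; the post does not show the actual RepoIO API.

```python
from abc import ABC, abstractmethod

class RepoIO(ABC):
    """Hypothetical protocol-agnostic repository access."""

    @abstractmethod
    def inventory(self):
        """Hashes of the patches the remote repository contains."""

    @abstractmethod
    def fetch_patches(self, hashes):
        """Contents of the requested patches, keyed by hash."""

class DumbRepoIO(RepoIO):
    """'Dumb' protocol: the remote is a plain file store, so the
    client issues one request per object it needs."""

    def __init__(self, files):
        self.files = files  # stands in for a directory served over HTTP

    def inventory(self):
        return sorted(self.files)

    def fetch_patches(self, hashes):
        # One lookup (i.e. one HTTP request) per patch.
        return {h: self.files[h] for h in hashes}

class SmartRepoIO(RepoIO):
    """'Smart' protocol: the server answers batched queries, so a
    whole operation costs a fixed, small number of round trips."""

    def __init__(self, query):
        self.query = query  # query(command, args): one round trip

    def inventory(self):
        return self.query("inventory", None)

    def fetch_patches(self, hashes):
        # A single round trip carrying all requested hashes at once.
        return self.query("patches", hashes)
```

The payoff of such a design is that commands like pull talk only to the interface, so supporting a new protocol means adding one more implementation rather than touching every command.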
<br />
The disadvantage of this approach is the need to re-implement the download code (for example, at the moment RepoIO does not support HTTP pipelining). However, the current download code needs refactoring anyway, and in my opinion getting a clean API with a fresh implementation justifies rewriting some pieces of code.<br />
<br />
At the moment, the changes needed for the pull command to work with RepoIO have already been made, and the first results of the smart server's work will be visible soon. Next week I will finish the implementation of RepoIO for the dumb and smart protocols, as well as the server side for CGI and for local use (to work via SSH). Together these changes will make a working smart server that can serve the get and pull commands. Next Sunday I'll make my final report on the completed work.

GSoC 2010 Progress Report #2 (2010-08-01)

<p>Last week I spent improving and debugging the code for repository packs. For those unfamiliar with this feature: repository packs are two tarballs, basic.tar.gz and patches.tar.gz, containing a copy of the repository contents. They make getting a Darcs repository over the network faster, and will be created by the 'darcs optimize --http' command once it is enabled.</p><p>The main changes are:</p><ul><li>while getting a repository via packs, the files are hardlinked into the Darcs global cache,</li><li>a small tuning of the pack format to support the <a href="#parget">parallel get using the cache</a>,</li><li>further development and debugging of the code for parallel get,</li><li>optimization of the packing of inventories.</li></ul><a name='more'></a><p>The last change resolves <a href="https://bugs.darcs.net/issue1889">issue1889</a>, making it possible to get rid of unnecessary files in the basic pack, thereby reducing its size. As a repository is used, the inventories directory can accumulate quite a large number of unneeded files.
In the case of the Darcs unstable repository, these files make up a significant part of the basic pack: the unoptimized pack takes 21MB versus 1.7MB with optimization.<br /></p><p>I've also figured out what caused <a href="https://bugs.darcs.net/issue1884">issue1884</a>, the wrong message reporting success for an incomplete darcs get. It turns out that Darcs.Command.Get has an interrupt handler that covers almost the entire code of darcs get and unconditionally reports success on interrupts. The fix is easy, but it conflicts with the rest of my work that is waiting for review, so it will have to wait a bit too.<br /></p><p>By the way, this fix will enable one more optimization, because it clearly defines when the getting of a lazy repository is complete. It turns out that in order to get a lazy repository, it is not necessary to download the entire basic tarball: the inventory files at its end may be obtained lazily later. With this optimization, darcs get of a lazy packed repository will download the same files as the "classical" darcs get, only faster.<br /></p><p>While getting the packs can be much faster than getting a repository file by file, it can also be much slower if the repository files are already in the cache. But even that case does not necessarily win: the cache may be on a network share behind a slow connection, while, conversely, the "remote" repository may be close at hand, or even on the local host. As you can see, things get a little complicated here, and there will certainly be cases where trying to be clever and guess the best way to get the repository (file by file using the cache, or using packs) will fail.<br /></p><p id="parget">The way I solve this problem is actually simple: why choose between two options when you can use both? So I added to the beginning of the basic pack a list of the files it contains, in reverse order (the patches pack doesn't need one, as its contents can be inferred from the inventory).
Now, when you get the repository, the pack is downloaded, and once the list of its files is received, they are fetched in parallel, in reverse order. Downloading files from opposite ends of the list, the two download threads eventually discover that the file they are about to download already exists. At that point their work ends: the pack download is complete.<br /></p><p>The only remaining issue with packs I know about is the download implementation. The current code for downloading files in Darcs lets you use a file only after its download is complete, which is not suitable for my way of using the packs in parallel with the cache. Since I am going to write custom downloading code in my upcoming smart server work, I think it will be easier to provide a common interface for both smart and dumb (including lazy) downloads, instead of trying to alter the current code, which was not designed for lazy downloads.<br /></p><p>Now that I've finished most of the work on repository packs (well, almost; there will probably be a couple of rounds of review-amend ping-pong on the <a href="https://bugs.darcs.net/">Darcs bug-tracking system</a>), I'm starting to write the code for the smart server. After designing the server's interface (I'll post the specification on the <a href="https://wiki.darcs.net/">Darcs wiki</a>), I'll start with the client side (this will help solve the problem with downloading tarballs and finish my work on the repo packs sooner). I'll make my next progress post on Saturday, August 7.<br /></p>

GSoC 2010 Progress Report #1 (2010-06-25)

This summer I am working on making Darcs faster over networks.
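As an aside, the two-ended parallel get described in Report #2 can be modelled in a few lines. This is a simplified sketch with hypothetical names, not Darcs code: both workers share one fetch function here, whereas in Darcs one side streams the pack while the other copies files from the cache or the remote repository.

```python
import threading

def parallel_get(file_list, fetch, store):
    """Fetch file_list from both ends at once. Each worker stops when it
    reaches a file that is already present (fetched by the other worker),
    so between them they cover the whole list without a coordinator."""
    def worker(names):
        for name in names:
            if name in store:      # the other worker already got
                return             # this far: our half is done
            store[name] = fetch(name)

    front = threading.Thread(target=worker, args=(file_list,))
    back = threading.Thread(target=worker, args=(file_list[::-1],))
    front.start(); back.start()
    front.join(); back.join()
    return store
```

Starting from an empty store, the two workers meet somewhere in the middle; a worker that stops does so only because the other has already covered the rest of the list, so every file ends up fetched exactly as the post describes.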
My project consists of two parts:<br /><ol><li>Implement an <a href="https://bugs.darcs.net/issue1771">optimization</a> for getting a repository over HTTP. This is done by creating a snapshot of the current repository state.</li><li>Create a <a href="https://bugs.darcs.net/issue1773">smart server</a> for Darcs, which is able to determine the patches needed by the client and send them in its response. This will be the most effective way to get/pull a repository, since it reduces the number of roundtrips to a minimum.<br /></li></ol>The original full description of the project can be seen <a href="https://wiki.darcs.net/GoogleSummerOfCode/2010-Network">on the wiki</a>. Note, however, that for the smart server, priority is given to the CGI frontend rather than plain HTTP.<h4>Changes so far</h4>I'm glad to report that most of the work on the HTTP optimization is complete, and the patches are on their way to the Darcs repository. Some notes on the implementation:<br /><ul><li>getting an optimized repository results in almost the same copy as getting an unoptimized one. While the inventory files may be split in different ways, the resulting repositories are semantically identical.</li><li>there are still a couple of issues with special cases, such as working with the cache and handling interrupts; I hope to resolve them shortly.</li></ul><h4>Next week</h4>I have now started implementing the smart server. Besides finishing the work on the optimize --http issues, next week I will refactor the get/pull commands' code, which will result in a cleaner API for the server.
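To make the round-trip argument in part 2 concrete, here is a minimal sketch of the kind of negotiation a smart server could perform. The names and message shapes are hypothetical (the actual protocol specification was still being written at the time of these posts): the client sends the hashes it already has in one request, and the server answers with every missing patch in one response.

```python
def serve_pull(client_hashes, inventory, patches):
    """Server side: given the hashes the client already has, return the
    patches it lacks, preserving the server's inventory order."""
    have = set(client_hashes)
    return [(h, patches[h]) for h in inventory if h not in have]

def smart_pull(my_hashes, server):
    """Client side: the entire pull costs a single round trip."""
    return server(my_hashes)
```

By contrast, a dumb get/pull needs at least one request for the inventory plus one per missing file, which is what makes a smart server so much more effective over a high-latency link.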