Reduce Subversion pristine copy

From ZděchovNET
Revision as of 27 July 2022, 17:20, created by Chronos

The Subversion VCS stores a copy of each file in the pristine directory (formerly called text-base), located in the hidden .svn directory in the working copy root. Storing a copy of every file means that a working copy takes about twice as much space as the user's files alone. This may not be a problem for a small repository, but if you want to store, for example, 30 GB of photos, music, video or other big binary files in a repository and check out a working copy on a machine with a 60 GB or smaller drive, then you have a problem.

Checkout is also slower, because two copies of every downloaded file have to be written to the file system instead of one, which doubles the amount of data written and raises disk load.
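To see how much of a particular working copy is taken by pristine data, the two sizes can be compared with du. The following is only a minimal sketch: the function name pristine_overhead is hypothetical, and it assumes a Subversion 1.7+ working copy with a single .svn directory in the root.

```shell
# Hypothetical helper: print the size of the whole working copy and of its
# pristine store, both in KiB. Assumes the pristine files live in
# <working copy root>/.svn/pristine (Subversion 1.7+ layout).
pristine_overhead() {
    wc_root=${1:-.}
    # du -sk prints "<size in KiB><TAB><path>"; keep only the size.
    total_kb=$(du -sk "$wc_root" | cut -f1)
    pristine_kb=$(du -sk "$wc_root/.svn/pristine" | cut -f1)
    echo "total: ${total_kb} KiB, pristine: ${pristine_kb} KiB"
}
```

Called as pristine_overhead /path/to/working-copy, it prints one line with both numbers; for a working copy of large binary files the pristine figure is expected to be close to half of the total.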

Possible methods of reducing pristine size

Subversion client built-in mechanism

This feature could be implemented directly in the client in some later version (1.9 or newer).

There is already an open issue, Issue 525, titled "allow working copies without text-base/". However, the authors want to stay with the original Subversion principle that storage is cheap and the network is slow, so they want to keep the copy of each file.

File deduplication of working copy

Deduplication could possibly work at the file system level, either with existing file systems or as a virtual file system. However, file deduplication at the system level needs more CPU power and more memory. There are also copy-on-write file systems, which could be usable if the Subversion client copied files through a system service function for copying. Btrfs, for example, comes close to this idea, but it uses copy-on-write for a different purpose: to avoid losing data during a power outage, new data is not written over old data but to a different place.
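As an illustration of what a copy-on-write copy looks like from user space, GNU coreutils cp accepts --reflink=auto, which asks the file system to share data blocks between source and copy and falls back to an ordinary copy where that is not supported. The sketch below assumes GNU cp; the function name cow_copy is hypothetical. It only shows the copy primitive and does not change how the Subversion client writes its files.

```shell
# Hypothetical helper: copy a file, sharing data blocks with the source when
# the file system supports it (for example Btrfs). Assumes GNU coreutils cp.
cow_copy() {
    # If this cp does not know --reflink at all, retry as a plain copy.
    cp --reflink=auto -- "$1" "$2" 2>/dev/null || cp -- "$1" "$2"
}
```

A copy made this way takes almost no extra space until one of the two files is changed.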

One example could be Scord, which solves this for versions <= 1.6 on the Linux platform.

Theoretically, a file system with deduplication support could be used. Under Linux this could be realized as a file mounted as a loop device. The ZFS file system supports deduplication, but it is not part of the Linux kernel.
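Before setting up such a file system, it may be useful to estimate how many working files are exact duplicates of a pristine file. The sketch below assumes Subversion 1.7+, where each pristine file is named after the SHA-1 of its content and stored as .svn/pristine/<first two hex digits>/<sha1>.svn-base, and that the sha1sum tool is available. The function name list_duplicates is hypothetical. Files with svn:keywords or svn:eol-style set may differ from their stored form and would then not be reported, and file names containing newlines are not handled.

```shell
# Hypothetical helper: list working files whose content is already present,
# byte for byte, in the pristine store of the given working copy.
list_duplicates() {
    wc_root=${1:-.}
    # Walk the working copy but skip the .svn administrative directory.
    find "$wc_root" -path "$wc_root/.svn" -prune -o -type f -print |
    while IFS= read -r f; do
        sum=$(sha1sum < "$f" | cut -d' ' -f1)
        prefix=$(printf '%s' "$sum" | cut -c1-2)
        # A pristine file with this name means identical content is stored twice.
        if [ -f "$wc_root/.svn/pristine/$prefix/$sum.svn-base" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Every path printed is a file that a deduplicating file system could, in principle, store only once.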

Make pristine files zero size by helper program

It may be possible to temporarily make the pristine files empty and update their content manually by invoking a helper program. Such a program could check svn status and download the correct content of modified files from the repository so that they can be committed, and set all other pristine files to zero size. The program should be able to work with the SVN SQLite database to read the relation between file names and pristine names. It should also be multi-platform.
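The relation between file names and pristine names could be read roughly as follows. This is a sketch under assumptions, not a tested tool: it assumes a Subversion 1.7+ working copy whose .svn/wc.db has a NODES table with local_relpath, checksum and kind columns, checksums stored in the form $sha1$<hex>, and the sqlite3 command-line tool. Both function names are hypothetical.

```shell
# Hypothetical helper: turn a checksum as stored in wc.db ("$sha1$<hex>")
# into the pristine file path, relative to the working copy root.
pristine_path_for() {
    hex=${1#\$sha1\$}                          # strip the "$sha1$" prefix
    prefix=$(printf '%s' "$hex" | cut -c1-2)   # shard directory: first two hex digits
    printf '.svn/pristine/%s/%s.svn-base\n' "$prefix" "$hex"
}

# Hypothetical helper: print "relative path|checksum" for every versioned file.
# Assumes the sqlite3 CLI and the NODES table layout described above.
list_pristine_relations() {
    sqlite3 "${1:-.}/.svn/wc.db" \
        "SELECT local_relpath, checksum FROM NODES WHERE kind = 'file' AND checksum IS NOT NULL;"
}
```

Each checksum printed by list_pristine_relations can be passed to pristine_path_for to locate the matching pristine file.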

zero-pristine.sh:

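#!/bin/bash
# Truncate every non-empty pristine file of the working copy in the current directory.

PRISTINE_DIR="$(pwd)/.svn/pristine"
i=0
# Read NUL-separated names so that paths with spaces are handled correctly;
# process substitution keeps the counter in the current shell.
while IFS= read -r -d '' F
do
    chmod 666 "$F"    # pristine files are read-only
    : > "$F"          # truncate to zero size
    chmod 444 "$F"
    i=$((i + 1))
done < <(find "$PRISTINE_DIR" -name '*.svn-base' -type f -size +0 -print0)

echo "$i svn pristine files zeroed"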

If the pristine files are empty, all svn operations can still be used except those that depend on file content, such as svn diff or svn commit of modified files. To work with file diffs as well, the pristine file content would have to be restored from the repository somehow.
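A first step toward such a restore is finding out which files need their pristine content back. The sketch below picks the modified files out of plain svn status output (without -u or -v), assuming the usual layout of seven one-character status columns followed by a space, so that the path starts at column 9. The function name modified_paths is hypothetical, and the download step itself is left open here.

```shell
# Hypothetical helper: read `svn status` output on stdin and print the paths
# whose text is modified ("M" in the first column).
modified_paths() {
    # Keep lines starting with "M", then cut away the 8 leading status characters.
    grep '^M' | cut -c9-
}
```

Typical use would be svn status | modified_paths, run in the working copy root.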

Alternative VCS

Unfortunately, there are not many other version control systems that can download just a single copy of each file during checkout.

  • CVS - the older predecessor of Subversion
  • Bazaar - can work in client-server mode and check out just a single copy of each file. No longer developed.
  • BVersion - a centralised version control system for managing binary files like images, audio, and video.
  • Boar VCS

External links