COCKOS
CONFEDERATED FORUMS
08-24-2007, 09:58 PM   #1
trog
Human being with feelings
 
Join Date: Aug 2007
Posts: 2
File contents comparison (md5sum) support?

Heyas,

I've searched for an existing thread on this but can't find one; please excuse me if this has been raised before.

I am wondering if anyone has considered adding support for performing md5 (or sha1, or whatever) hashes of files before doing the sync.

I have a bunch of files that I like to keep sync'ed, but I'm not comfortable relying on file dates or file sizes. I recently discovered that a bunch of files that I had synced relying on date and size (NOT with pathsync) have become corrupt gibberish - I wouldn't have noticed because on the surface they look fine, but the contents are hosed.

What would be ridiculously handy for me is just another checkbox - "ignore md5sums" - which, when unchecked, would perform an md5sum hash on the source and destination files and let me choose whether to overwrite the source or the destination based on the result.
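
For illustration only, here is a minimal sketch of the check described above, written in Python with the standard hashlib module. The function names file_md5 and files_match are made up for this example and are not part of PathSync; the file is read in chunks so large files don't need to fit in memory.

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the whole file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def files_match(src, dst):
    """True if both files have identical contents according to MD5."""
    return file_md5(src) == file_md5(dst)
```

A sync tool could call files_match(src, dst) for each pair whose size and date already agree, and flag the pair for review when it returns False.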

So - I was curious to know if anyone had started working on functionality like this, and/or if anyone else thought it would be a useful feature.

Thanks,
-- trog

Last edited by trog; 08-24-2007 at 10:00 PM. Reason: clarify subject
09-01-2007, 01:20 AM   #2
c2R
Human being with feelings
 
Join Date: Nov 2005
Location: Hertford, England
Posts: 23

I like the idea of this feature - I've often been a little concerned about the longer-term data integrity of what I'm backing up...
09-17-2007, 10:14 AM   #3
rob
Human being with feelings
 
Join Date: Sep 2007
Posts: 1

It would definitely be a nice feature. I'd assume that plain MD5 would be a bit slow in some cases, though. There must be a way to do a similar check that isn't as costly. Rsync is able to do so very quickly. Maybe there is some way to pull a checksum directly out of the file system?
09-17-2007, 03:08 PM   #4
trog
Human being with feelings
 
Join Date: Aug 2007
Posts: 2

Quote:
Originally Posted by rob
It would definitely be a nice feature. I'd assume that plain MD5 would be a bit slow in some cases, though. There must be a way to do a similar check that isn't as costly. Rsync is able to do so very quickly. Maybe there is some way to pull a checksum directly out of the file system?
I've often considered a "quick and dirty md5sum" variant that would be useful in the situation where you don't want to spend the time doing an entire md5sum.

Basically you could just md5 the first (say) 100k bytes of a file, the middle 100k bytes, and the last 100k bytes.

While this wouldn't be as accurate as doing the full file, you might be able to get "good enough" accuracy to make doing comparisons much faster.

Obviously some testing would be good to tweak the numbers and style (e.g., maybe it'd be more effective to md5sum 10k bytes, then skip a big chunk, then the next 10k bytes, and so on).
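
As a rough sketch of the "quick and dirty" idea, assuming Python and the standard hashlib module: the function name sampled_md5 and the 100 KiB default are placeholders for this example, not anything PathSync provides. Mixing the file size into the digest is an extra assumption added here so that files of different lengths never compare equal.

```python
import hashlib
import os

def sampled_md5(path, sample_size=100 * 1024):
    """MD5 over the first, middle, and last `sample_size` bytes (hypothetical helper)."""
    size = os.path.getsize(path)
    digest = hashlib.md5()
    # Assumption for this sketch: include the size in the digest.
    digest.update(str(size).encode("ascii"))
    with open(path, "rb") as f:
        if size <= 3 * sample_size:
            # Small file: the three samples would overlap, so hash everything.
            for chunk in iter(lambda: f.read(sample_size), b""):
                digest.update(chunk)
        else:
            # Start offsets of the first, middle, and last samples.
            offsets = (0, (size - sample_size) // 2, size - sample_size)
            for offset in offsets:
                f.seek(offset)
                digest.update(f.read(sample_size))
    return digest.hexdigest()
```

Note the trade-off: any corruption that falls entirely outside the sampled regions goes undetected, which is why a full-file hash is still needed when integrity really matters.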

I'd still keep the option of complete md5 checking, though, for situations where data integrity is important.
09-19-2007, 01:12 PM   #5
c2R
Human being with feelings
 
Join Date: Nov 2005
Location: Hertford, England
Posts: 23

That said... would it be difficult to implement an optional complete md5 checksum comparison? I'm nowhere near technical enough to even think about adding it )o: