ddupes

Differences

These are the differences between the selected revision and the current version of the page.

ddupes [2011/10/11 22:46] – 2.3 pietro
ddupes [2017/01/20 19:10] (current version) – [What is this?] true edit pietro
Line 5 (old) / Line 5 (current):
  
 ===== What is this? =====
 +
  
 **ddupes** is a python program which extends fdupes action to directories.
  
 **ffdupes** (//"fast fdupes"//) is an enhanced version of fdupes.
 +
 +**Update:** when I wrote this page, I was unaware of the //many// other command line tools for finding duplicate files: for instance, Debian provides not only fdupes, but also rdfind, hardlink, finddup and duff. I //have no idea// how they compare to ffdupes: they may well outperform it. I did not, however, find any replacement for ddupes. Note that these tools have incompatible interfaces (arguments and output), so ddupes cannot use their output.
  
 fdupes/ffdupes //"finds duplicate files in a given set of directories"//.
Line 28 (old) / Line 31 (current):
 necessarily read //all// files it must compare: instead, it first tries to
 compare the heads, and reads the rest only if they match.
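
The head-first strategy described above can be sketched in Python as follows. This is only an illustration, not the ffdupes source: the function name `same_content`, the 4096-byte head length and the up-front size check are assumptions that do not appear on this page.

```python
import os

# Hypothetical head length; the page does not state the value ffdupes uses.
HEAD_SIZE = 4096


def same_content(path_a, path_b, head_size=HEAD_SIZE, chunk_size=1 << 16):
    """Return True if the two files contain exactly the same bytes."""
    # Assumption: files of different size are rejected without reading them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # Compare the heads first: most non-duplicates are rejected here.
        if fa.read(head_size) != fb.read(head_size):
            return False
        # Heads match: only now read and compare the rest, chunk by chunk.
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files are exhausted
                return True
```

With this layout, two large files that differ in their first bytes cost a single small read each, while files that differ only near the end still have to be read almost completely.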
 +
 +A larger test (thanks, Florian Bruhin!), run on 2.5 TB of data in
 +~727 000 files, gave the following results:
 +  * fdupes:  6 Hours 23 Minutes
 +  * ffdupes: 4 Hours 19 Minutes
 +  * ddupes:  40 Minutes
  
 That said, in the worst case in which there are many files which are almost
Line 35 (old) / Line 44 (current):
  
 If ffdupes is used with the "--algorithm" option set to "adler32", it will
-run statistically slower, but faster in the worst case (in particular, it will
+run slower on average, but faster in the worst case (in particular, it will
 run faster than fdupes in //all// cases).
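
As a rough illustration of checksum-based matching, the sketch below groups files by their Adler-32 checksum using Python's standard `zlib.adler32`. It is not the ffdupes implementation: the helper names `adler32_of` and `group_by_checksum` are hypothetical, and how ffdupes actually applies the checksum is not described on this page.

```python
import zlib
from collections import defaultdict


def adler32_of(path, chunk_size=1 << 16):
    """Compute the Adler-32 checksum of a file, reading it in chunks."""
    checksum = 1  # standard Adler-32 starting value
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            # Feed the running value back in to continue the checksum.
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF


def group_by_checksum(paths):
    """Return lists of paths that share a checksum (duplicate candidates)."""
    groups = defaultdict(list)
    for path in paths:
        groups[adler32_of(path)].append(path)
    # Keep only checksums seen more than once.
    return [group for group in groups.values() if len(group) > 1]
```

Every file is read in full exactly once here, which is one plausible reason for the trade-off described above: there is no early exit on differing heads, but also no repeated byte-by-byte comparison among many almost-identical files. Equal checksums only mark candidates; a final content comparison is still needed to rule out collisions.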
  
Line 68 (old) / Line 77 (current):
 of members) groups of duplicates, which reside in directories which are very
 similar but not identical. This should be a quite remote eventuality, but if you
-do find some patologic case, please report.
+do find some pathological case, please report.
  
 ===== Who should I blame if this sucks? =====
Line 74 (old) / Line 83 (current):
 Pietro Battiston - <me@pietrobattiston.it>
  
-Last version of ddupes can always be found at
-http://www.pietrobattiston.it/ddupes.
+Last version of ddupes can always be found at http://www.pietrobattiston.it/ddupes.
+The source repo can be obtained with
+  git clone git://pietrobattiston.it/ddupes
+and browsed at http://www.pietrobattiston.it/gitweb
  
 ===== Requirements =====
  
 ddupes and ffdupes are written in Python, so you need python to run them.
ddupes.1318366012.txt.gz · Last modified: 2011/10/11 22:46 by pietro