FITS
- File format identification
- Metadata extraction
Main bottlenecks
- JVM lag
- DROID
- MD5 checksum calculation
- JHOVE
- XSLT compilation
JVM lag
- Time to start up a fresh Java VM
- Happens every time you run "fits.sh"
- Depends on the computer, but measured between 0.5 s and 10 s
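One way to take this kind of measurement is to time repeated launches of a command and average them. The sketch below is a minimal illustration, not the method used for the figures above: `average_startup_seconds` and `STAND_IN` are hypothetical names, and the stand-in command starts a fresh Python interpreter rather than a JVM.

```python
import subprocess
import sys
import time


def average_startup_seconds(command, runs=3):
    """Run `command` `runs` times and return the mean wall-clock time per run."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        total += time.perf_counter() - start
    return total / runs


# Stand-in command (assumption): a fresh interpreter that does nothing.
# To approximate JVM lag, substitute a command that launches the JVM,
# e.g. ["java", "-version"], assuming `java` is on PATH.
STAND_IN = [sys.executable, "-c", "pass"]

print(f"average startup: {average_startup_seconds(STAND_IN):.3f} s")
```

Timing a do-nothing launch only captures bare startup cost; a real FITS run also loads its own classes, so the lag per file may be higher.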
JVM lag example
- Transfer containing 17,000 files
- Average 2s JVM lag per file
- Total: 17,000 × 2 s = 34,000 s, or about 9.4 hours wasted!
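The figure follows directly from the two numbers on the slide. A quick check, with `wasted_hours` as a hypothetical helper name:

```python
def wasted_hours(file_count, lag_seconds_per_file):
    """Total startup lag in hours when every file pays the lag once."""
    return file_count * lag_seconds_per_file / 3600


# Numbers from the example: 17,000 files, 2 s average JVM lag each.
print(f"{wasted_hours(17_000, 2):.1f} hours")  # prints "9.4 hours"
```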
JVM server
- Maintains a persistent JVM with classes already loaded
- Allows CLI tools to be run without JVM startup lag
- We contributed a Nailgun server startup script to FITS 0.8.0
DROID
- FITS 0.6.x used DROID 3
- Slow startup due to its XML parser
- Spent more time parsing XML than IDing files!
- Archivematica switched to FIDO for file identification
- FITS has since upgraded to DROID 6, which is faster
MD5
- FITS always calculated an MD5 checksum for every file
- On large files, this took 10% or more of the processing time
- Archivematica never used it!
- We submitted a change that makes MD5 calculation configurable
- Included in FITS 0.8.0
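The idea of a configurable checksum can be sketched as a switch around the hashing step. This is an illustration only, not the FITS implementation or its configuration format: `describe_file` and `compute_md5` are hypothetical names, and the default value is arbitrary.

```python
import hashlib


def describe_file(path, compute_md5=True, chunk_size=1024 * 1024):
    """Return basic info for `path`; the MD5 digest is included only on request."""
    info = {"path": path}
    if compute_md5:
        digest = hashlib.md5()
        # Read in chunks so large files are never loaded into memory at once.
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        info["md5"] = digest.hexdigest()
    return info
```

A caller that never uses the checksum passes `compute_md5=False` and skips reading the file contents entirely, which is where the saving on large files comes from.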
XSLT / XML
- FITS uses XSLT to translate tool output to its format
- XSLT is compiled on startup
- This only needs to happen once
- But the cache is thrown away when FITS exits
- Nailgun fixes this too!