Best Image Formats for Archival Photography
Images come in hundreds of formats, including proprietary formats. Knowing which format to use when capturing new photographs can be confusing, especially if we hope for them to be a good fit for a long-term, sustainable, archived collection. Here we put together some information and resources around available archival photography formats and the latest recommendations for the field of Archaeology and Cultural Heritage.
First off, we at CoDA are not archivists, but we are experts in digital media and part of our mission is to help people become DIY archivists so that their digital memories will last. So let’s talk a little about our definition of “archival”. The public perception of archival is something belonging in an archive, a place where culture goes to die. Digital archives, however, bring new life through accessibility to objects and knowledge that were once only available through temporary exhibition or visits to underground storage facilities. We must broaden our terminology and accept the definition of archival from the Society of American Archivists in their multivolume Archival Fundamentals Series II:
archival, adj. … … – 2. Records · Having enduring value; permanent. – 3. Records media · Durable; lacking inherent vice; long-lived; see archival quality. … … – 6. Computing · Information of long-term value that, because of its low use, is stored on offline media and must be reloaded, or that is in a form that must be reconstructed before use.
A glossary of archival and records terminology / Richard Pearce-Moses. 2005
Archival-quality file management starts at the moment of capture and extends through the entire lifetime of your files and all of their various versions. In this series, we focus on digital still images and how to capture, manage, and share these images so they have enduring value. We start here by introducing you to some file formats, some background on the pros and cons of capturing in these formats, and how to version these images to archive offline and share online. This calls for some discussion of the differences between original files and derivative files.
1. Original Files
When we use the term “Original files”, we refer to the file that is created when you first capture an image. This file is the truest digital version of what your camera’s sensor recorded when you took the still image. It is important to differentiate between original and any other copies, versions, or exports of the file (see derivatives) because every time a file is copied or exported it is essentially a different file, and with each version, there is a risk of loss of quality, bits, or information. Imaging at the highest quality possible is your best chance at creating lasting images.
CAMERA RAW FILES
When possible, and when you have an option (e.g. you are using a DSLR camera to capture digital images), you should set your camera to capture a format that is suitable to be archived as your original, usually the camera raw file format.
Raw formats have many advantages as archival formats:
- Greater control over the rendering process, especially when using non-destructive editing software;
- Richer data content (greater bit-depth, but also a wider color gamut);
- Uncompressed or losslessly compressed sensor data, yet still much smaller in size than uncompressed rendered files such as TIFF (about one third in size!).
But because they are proprietary, they are always at risk of being discontinued without notice in the future. Moreover, there are several formats, so if you have multiple cameras or switch tools over the years, this will affect consistency in your archive. Unless you have a solid preservation plan for your collections that includes a periodic review (with the possibility of converting data) at least every 5 years, consider converting your raw originals to DNG.
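As a starting point for such a periodic review, it can help to know how many files of each format an archive actually holds. The following is a minimal sketch, not a complete tool: the function names and the list of raw extensions are assumptions made for illustration, and the extension list is deliberately non-exhaustive.

```python
from collections import Counter
from pathlib import Path

# Assumption: a small, non-exhaustive set of proprietary raw extensions
# (Canon, Nikon, Sony, Olympus, Panasonic, Fujifilm). Extend as needed.
PROPRIETARY_RAW = {".cr2", ".cr3", ".nef", ".arw", ".orf", ".rw2", ".raf"}


def inventory_formats(root):
    """Count the files under `root` by lowercase file extension."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )


def proprietary_raw_counts(counts):
    """Keep only the extensions listed in PROPRIETARY_RAW."""
    return {ext: n for ext, n in counts.items() if ext in PROPRIETARY_RAW}
```

Calling `proprietary_raw_counts(inventory_formats("my_archive"))` (where `my_archive` is a hypothetical folder name) would show which proprietary raw formats are present and in what numbers, which is the information needed to decide whether a DNG conversion is worth planning.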
In general, archiving your untouched camera originals is recommended, but because many raw formats are proprietary, derivative DNG (Digital Negative) files converted from raw are an acceptable exception and have become a cross-brand standard in the field.
Some digital cameras even have an option to save photographs directly in DNG, so you may not have to convert them later.
DNG is a container format that will preserve the camera raw sensor data just as proprietary raw files do, but in addition to this:
- Although developed and patented by Adobe, the DNG format is an open, documented format, whose specifications are freely available to software vendors and users; this encourages wider adoption and makes it unlikely that the format will go unsupported overnight;
- DNG supports hash verification, a special tag storing a digest of the image data that makes it possible to detect data corruption and even to identify files uniquely;
- DNGs are generally smaller than camera raw formats due to their efficient lossless compression, but they can become bigger if they embed one or more full-size previews of the image (usually JPEG) that include any edits and resizing specified in external editing software;
- DNG supports EXIF, IPTC and XMP metadata embedded in the same container file (whereas some proprietary formats save metadata in external text files known as “sidecar files”).
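The hash verification mentioned above is built into the DNG container. For files in other formats, a similar integrity check can be approximated from the outside with file-level checksums. The sketch below is a different technique from the DNG embedded digest: it hashes whole files with SHA-256 using Python's standard library, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List manifest entries that are now missing or have different content."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A manifest built at ingest time and stored alongside the archive can be re-checked at each review; any name returned by `changed_files` points to a file that was altered, corrupted, or removed since the manifest was created.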
TIFF (Tagged Image File Format) is universally considered a good format for archiving photo collections.
- Just like DNG, TIFF is a documented format (variants such as TIFF/EP and TIFF/IT are also ISO standards), likely to be supported long into the future;
- TIFF can provide the highest quality when saved uncompressed or with lossless compression;
- TIFF can store layers, edits, EXIF, and IPTC metadata.
TIFF should be preferred to compressed formats when scanning printed pictures or acquiring pictures in a different way than through a DSLR (for example, some iPhone camera apps can be set to capture TIFF). The drawback of this file format is its considerable size (much larger than raw files, DNG, or other rendered formats such as JPEG). Adobe holds the copyright on the TIFF specification (TIFF 6.0).
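To put the size difference in numbers, here is a rough back-of-the-envelope calculation. It assumes a hypothetical 24-megapixel sensor that records one 14-bit sample per pixel in raw, compared with a rendered TIFF holding three 16-bit color channels per pixel; headers, metadata, previews, and compression are ignored, so real files will differ.

```python
def uncompressed_size_mb(megapixels, channels, bits_per_channel):
    """Approximate uncompressed pixel data size in megabytes (10**6 bytes)."""
    total_bytes = megapixels * 1_000_000 * channels * bits_per_channel / 8
    return total_bytes / 1_000_000


# Assumed example values: 24 MP sensor, 14-bit raw, 16-bit RGB TIFF.
raw_mb = uncompressed_size_mb(24, channels=1, bits_per_channel=14)   # 42.0
tiff_mb = uncompressed_size_mb(24, channels=3, bits_per_channel=16)  # 144.0
ratio = raw_mb / tiff_mb  # roughly 0.29
```

Under these assumptions the raw data is a little under one third the size of the uncompressed TIFF, which is consistent with the "about one third" figure given for raw files earlier.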
Standard JPEG (Joint Photographic Experts Group) is a good format for archiving only if your camera originals are JPEGs. JPEGs are also a valid format for derivatives, but more about that below.
JPEG is a lossy compressed format: each time an image is edited and re-saved as JPEG, it is re-encoded and loses some of its initial quality (simply copying, renaming, or moving the file does not recompress it). For this reason, if you do choose JPEG for archiving, make sure you store your files in one (or better, multiple) locations early on and do not modify them afterwards. Use non-destructive editing software such as Adobe Lightroom, Apple’s iPhoto, or Apple’s Aperture to manage edits to the pictures and their metadata.
An increasingly popular alternative, JPEG2000 offers a fully lossless compression option as well as an encoding/decoding stream that makes it a more efficient format for presentations and use in applications. Many cultural institutions and projects use this format, including the Library of Congress, the Harvard University Library, Library and Archives Canada, the Chronicling America website, and the Google Library Project.
RECOMMENDATIONS FOR ARCHIVAL PHOTOGRAPHY IN ARCHAEOLOGY AND CULTURAL HERITAGE
While there is no particular reason why archaeologists and cultural data custodians might prefer different formats while ensuring long-term sustainability of their digital images, the following resources will provide further information about regulation compliance and archival workflows.
Amongst the major digital archiving repositories for archaeology, the ADS (UK) provides information about a range of formats for image capture and recommends using TIFF or DNG files for long-term preservation purposes (TIF, also mentioned on the resource, is simply a legacy form of the TIFF file extension for older applications), while tDAR (US) will accept a wider range of formats (.TIFF/.TIF, .GIF, .JPEG, .BMP, .PICT and .PNG). The Library of Congress lists TIFF, JPEG2000, PNG, JPEG, DNG, BMP, and GIF as Preferred for the acquisition and archiving of digital photographs, based on adoption, transparency, technical characteristics, and established international standards.
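The format lists above can be turned into a small lookup for checking a file before deposit. This is only a sketch: the table reproduces the formats named in the previous paragraph and nothing more, the extension mapping and function name are assumptions for illustration, and each repository's current guidelines should be checked before relying on it.

```python
from pathlib import Path

# Assumed mapping from common file extensions to format names.
EXT_TO_FORMAT = {
    ".tif": "TIFF", ".tiff": "TIFF", ".dng": "DNG",
    ".jpg": "JPEG", ".jpeg": "JPEG", ".jp2": "JPEG2000",
    ".png": "PNG", ".gif": "GIF", ".bmp": "BMP",
    ".pict": "PICT", ".pct": "PICT",
}

# Formats per repository, exactly as listed in the text above.
REPOSITORY_FORMATS = {
    "ADS": {"TIFF", "DNG"},
    "tDAR": {"TIFF", "GIF", "JPEG", "BMP", "PICT", "PNG"},
    "Library of Congress": {"TIFF", "JPEG2000", "PNG", "JPEG", "DNG", "BMP", "GIF"},
}


def repositories_for(filename):
    """Return the repositories whose listed formats include this file's format."""
    fmt = EXT_TO_FORMAT.get(Path(filename).suffix.lower())
    return sorted(
        name for name, formats in REPOSITORY_FORMATS.items() if fmt in formats
    )
```

For example, a TIFF file is listed by all three repositories, while a proprietary raw file such as a `.nef` matches none of them and would need converting first.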
Overall, just like in the general digital arena, TIFF seems to be the contest winner when it comes to sustainable archiving for archaeology and cultural heritage.
Most common image formats and their release dates
2. Derivative Files
You may also want to archive derivative files such as those prepared for printing or delivery on a particular project, edited copies, a selection of the best or most relevant, or just in order to have a more ready-to-use copy of your pictures. The available “rendered” formats vary in quality and effectiveness, and should be chosen depending on your ultimate goals, storage capacity and access expectations. As we mentioned in our recent post on long-term sustainability, if forced to choose, always prioritize quality and safety of your core original archive over any derivative.
TIFF (TAGGED IMAGE FILE FORMAT)
TIFF is a common format for storing derivatives from camera raw originals, especially because it can be saved either uncompressed or compressed, can store layers, and can include embedded EXIF and IPTC metadata. TIFF can also store much richer color information than 8-bit compressed formats such as JPEG.
JPEG (JOINT PHOTOGRAPHIC EXPERTS GROUP)
JPEG is probably the most common option for exporting derivative files from an archiving format, as well as for sharing them over the web and through other applications (text editors, presentations, documents…). It uses lossy compression at an adjustable rate. When using non-destructive editing software, the exported JPEG will include all the edits and can embed EXIF and IPTC metadata.
PNG (PORTABLE NETWORK GRAPHICS)
PNG, like its predecessor GIF (Graphics Interchange Format), uses lossless compression and is designed as a “publishing” or rendered image format, specifically intended for the internet. Its edge over JPEG is support for an alpha channel (transparency), and its edge over uncompressed TIFF is a relatively small file size. However, the format was conceived for file exchange over the internet, not for professional-quality print graphics or long-term access.