PDF/A

Digital preservation with PDF archiving (PDF/A)

Companies handle countless documents, including PDFs in many different variants. If these are archived in their original format, there is a risk that they will no longer be legible years later. Electronic archiving therefore requires consistent conversion into a stable format that ensures long-term reproducibility and readability. As an ISO standard, PDF/A is the first choice for PDF archiving. Although its use is not mandatory, the format has established itself worldwide as the standard for long-term digital preservation. All information needed to display a document the same way every time is embedded in the PDF file itself. PDFreactor supports the creation of PDF/A files for easy and long-lasting archiving.

Many companies and government organizations trust PDFreactor to convert HTML to PDF/A-conformant documents and to archive their data digitally.

Did you know: What is PDF archiving?

PDF/A is a standard of the International Organization for Standardization (ISO) that describes the use of the Portable Document Format (PDF) for long-term archiving (the "A") of electronically stored documents. PDF is an established, platform-independent file format. PDF/A is a restricted subset of PDF: features that are unsuitable for long-term archiving, such as encryption or references to external resources, are prohibited. In addition, the standard requires reading applications to follow certain display guidelines.

The Advantages of PDF/A – Electronic archiving of documents

To be able to read and work with a document for years to come, it is important to archive it in a suitable format. PDF archiving with PDF/A and PDFreactor has many advantages. The aim of the PDF/A standard is to create PDF documents whose visual appearance remains intact over time. Note that a PDF/A file is not necessarily smaller than an ordinary PDF, because embedded fonts and color profiles can increase the file size. But what is the difference between PDF/A and PDF/X?

Comparison of PDF and PDF archiving

PDF/A is a special form of PDF. The PDF file format is used worldwide for the exchange and storage of electronic documents, but on its own it does not guarantee long-term legibility or complete independence from software. PDF/A ensures that a document can still be read without problems decades later. To digitally preserve a PDF, you therefore have to save the document as PDF/A. PDF/A files are self-describing, and all information necessary to read the document is embedded.

Archiving electronic documents with PDF/A or PDF/X?

PDF/X is another subset of the Portable Document Format (PDF). Although the ISO standards PDF/A (ISO 19005) and PDF/X (ISO 15930) were developed for different purposes, they have some things in common. In essence, PDF/X describes the requirements for print-ready files in order to enable the faithful transmission of data from prepress to actual printing. PDF/X is not intended for long-term PDF archiving. With an ordinary PDF document, certain links and fonts may no longer be available when the file is opened in the future. PDF/A therefore requires that all information, especially fonts, is embedded in the file itself. PDF/X, on the other hand, requires that the printing conditions, known as the output intent, are specified in the file. PDFreactor also supports the creation of PDF/X-conformant files.

PDF archiving – Standards for the electronic archiving of documents

Over time, the format has been developed further in several parts of the standard to improve usability in practice. PDF/A-1, PDF/A-2 and PDF/A-3 are three successive versions of the original ISO standard. Each defines, with varying strictness, which components of a PDF are required or prohibited for long-term PDF archiving. The goal of these standards is to ensure the reproduction of the visual appearance as well as the inclusion of the document’s structure. PDFreactor supports all of the following PDF/A conformance levels for archiving PDF files.

PDF/A-1

PDF/A-1 stands for unambiguous long-term visual reproducibility. It distinguishes two conformance levels: PDF/A-1a (Level A) and PDF/A-1b (Level B). With PDF/A-1b (B = Basic), all fonts and images must be embedded in the document so that it is fully self-contained and renders the same way everywhere. Level A (Accessible) adds the requirements that all text can be mapped to Unicode and that the content is tagged with structure information for accessibility. In short, 1b meets the minimum requirements of the standard, while 1a provides full compliance. PDF/A-1a should therefore be used if the document structure is needed, for example for display on mobile devices, or if accessibility requirements must be fully met for the digital preservation of PDF files.

PDF/A-2

In contrast to PDF/A-1, PDF/A-2 allows you to combine multiple PDF/A files into one container PDF. Just like PDF/A-1, the PDF/A-2 format offers unambiguous visual reproducibility and can meet accessibility requirements. In addition, it supports JPEG 2000 compression and very large page formats. Another important feature of PDF/A-2 is that multiple layers are allowed and OpenType fonts can be embedded. As with PDF/A-1, there are different conformance levels: PDF/A-2 defines 2a, 2b and 2u, extending the two levels of PDF/A-1 (PDF/A-1a and PDF/A-1b). PDF/A-2a enables long-term digital preservation of the semantics and structure of the document. With PDF/A-2b the focus is on visual reproducibility, which makes it well suited to archiving images and graphics. Level 2u additionally requires that all text can be mapped to Unicode, so that it can be reliably searched and extracted regardless of language or script.

PDF/A-3

PDF/A-3 is the most recent of the three versions described here, although PDF/A-1 and PDF/A-2 are still completely sufficient for many applications. In addition to the features of the previous versions, PDF/A-3 allows files of any format to be embedded, so original data such as XML or CAD files can be attached to the document. This means the source file can be stored alongside its archived PDF representation. This is especially useful for e-mail archiving, because e-mails can be converted to PDF/A together with their attachments and placed in a long-term archive as a single file.
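The differences between the versions and levels described above can be summarized as a small decision helper. The following is only a sketch under stated assumptions: the class `PdfaChooser`, its method and its parameters are hypothetical names chosen for illustration and are not part of PDFreactor, and real archiving policies may weigh additional criteria.

```java
/** Hypothetical helper (not a PDFreactor API): suggests a PDF/A flavor from requirements. */
final class PdfaChooser {

    /**
     * @param needsArbitraryAttachments source files such as XML or CAD data must be embedded
     * @param needsPart2Features        JPEG 2000, layers or embedded PDF/A files are needed
     * @param needsTaggedStructure      the document must carry accessible, tagged structure
     * @param needsUnicodeText          all text must be mappable to Unicode
     * @return a label such as "PDF/A-2u"
     */
    static String suggest(boolean needsArbitraryAttachments,
                          boolean needsPart2Features,
                          boolean needsTaggedStructure,
                          boolean needsUnicodeText) {
        // Arbitrary attachments require PDF/A-3; the other extras require at least PDF/A-2.
        int part = needsArbitraryAttachments ? 3 : needsPart2Features ? 2 : 1;

        // Level U exists only from PDF/A-2 onward, so move up if it is needed on its own.
        if (needsUnicodeText && !needsTaggedStructure) {
            part = Math.max(part, 2);
        }

        // Level A includes the Unicode requirement, so it takes precedence over U.
        char level = needsTaggedStructure ? 'a' : needsUnicodeText ? 'u' : 'b';
        return "PDF/A-" + part + level;
    }

    public static void main(String[] args) {
        // E-mail archive with original attachments and accessibility requirements
        System.out.println(suggest(true, false, true, false));
    }
}
```

The helper deliberately returns the least demanding flavor that satisfies the stated requirements. An archive that prefers the highest available level can simply pass `true` for every parameter.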

A closer look at PDF archiving: PDF/UA

To expose the structure of a document's content, PDF documents can be tagged. Through these tags, assistive technologies such as screen readers or voice input and output gain access to the content of the document. PDF/UA is used to create barrier-free PDF documents. UA in this case stands for “Universal Accessibility”. With PDFreactor it is also possible to create documents that are PDF/UA compliant. We recommend combining the benefits of PDF/A and PDF/UA for maximum accessibility and archivability.

HTML to PDF/A with PDFreactor

With PDFreactor it is very easy to convert HTML to PDF, and the result is a high-quality PDF document. PDFreactor can also serve as a building block of an electronic archiving workflow. To create a PDF/A file for the electronic archiving of documents, set the configuration property conformance when integrating PDFreactor, for example:

config.setConformance(Conformance.PDFA3A);

These are the PDF/A conformance levels supported by PDFreactor:

  • PDF/A-1a
  • PDF/A-1b
  • PDF/A-2a
  • PDF/A-2b
  • PDF/A-2u
  • PDF/A-3a
  • PDF/A-3b
  • PDF/A-3u

PDFreactor automatically ensures that the requirements for PDF/A files are fulfilled as far as possible.
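If the desired level comes from a configuration file as a human-readable label, it has to be mapped to a constant name. The sketch below assumes that the constants follow the pattern of `Conformance.PDFA3A` shown above. Only that one constant appears on this page, so the other names produced here are inferred and should be checked against the PDFreactor API documentation. The class `ConformanceNames` is a hypothetical helper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: turns a label like "PDF/A-3a" into a name like "PDFA3A". */
final class ConformanceNames {

    // Accepts the parts 1 to 3 and the levels a, b and u listed above.
    private static final Pattern LABEL = Pattern.compile("PDF/A-([123])([abu])");

    static String toConstantName(String label) {
        Matcher m = LABEL.matcher(label.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a supported PDF/A label: " + label);
        }
        int part = Integer.parseInt(m.group(1));
        char level = m.group(2).charAt(0);
        if (part == 1 && level == 'u') {
            // Level U was introduced with PDF/A-2.
            throw new IllegalArgumentException("PDF/A-1 defines no level U");
        }
        // Assumption: constant names follow the pattern of PDFA3A.
        return "PDFA" + part + Character.toUpperCase(level);
    }

    public static void main(String[] args) {
        System.out.println(toConstantName("PDF/A-3a"));
    }
}
```

If `Conformance` is a Java enum, which the syntax of the snippet above suggests but this page does not state, the resulting name could be resolved with `Conformance.valueOf(...)`.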

PDF archiving restrictions

PDF/A imposes restrictions, which PDFreactor automatically enforces. Some restrictions are listed as examples.

All fonts and images used are embedded. Multimedia content and encryption are prohibited. Documents at conformance level A must be tagged. Attachments are prohibited in PDF/A-1 and restricted to PDF/A files in PDF/A-2; only PDF/A-3 allows attachments of any format. The metadata contained in the PDF must be standards-based XMP. With PDFreactor, no manual intervention is required when archiving PDF files.
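Some of these restrictions depend on the PDF/A version or level, which can be expressed as a small lookup. This is a minimal sketch covering only the example restrictions discussed here, not the full standard. The class `PdfaRules` and its members are hypothetical and not part of PDFreactor.

```java
/** Hypothetical lookup of a few PDF/A restrictions; not a complete rule set. */
final class PdfaRules {

    enum Attachments { NONE, PDFA_ONLY, ANY }

    /** Which embedded files a given PDF/A part (1, 2 or 3) permits. */
    static Attachments attachmentsAllowed(int part) {
        switch (part) {
            case 1: return Attachments.NONE;      // no embedded files at all
            case 2: return Attachments.PDFA_ONLY; // embedded files must be PDF/A themselves
            case 3: return Attachments.ANY;       // arbitrary formats such as XML or CAD
            default: throw new IllegalArgumentException("Unsupported PDF/A part: " + part);
        }
    }

    /** Tagged structure is mandatory only at conformance level A. */
    static boolean taggingRequired(char level) {
        return Character.toLowerCase(level) == 'a';
    }

    /** Encryption is prohibited in every PDF/A part and level. */
    static boolean encryptionAllowed() {
        return false;
    }

    public static void main(String[] args) {
        System.out.println(attachmentsAllowed(3) + " " + taggingRequired('b'));
    }
}
```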

Validating PDF/A

Validation checks documents for conformance with the ISO standards for PDF and PDF/A. Files must be checked for correctness and long-term readability before they are archived digitally. If you are in doubt about the standard conformance of a particular PDF/A document, we recommend checking the PDF file before archiving it. The validation is optional and can be enabled like this:

config.setValidateConformance(true);
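Independently of full validation, an archive intake step can quickly check which PDF/A flavor a file claims to be. PDF/A files declare this in their XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below reads only that declaration. It assumes the XMP packet is stored uncompressed and in an ASCII-compatible encoding, it does not verify that the file actually conforms, and the class `PdfaIdProbe` is a hypothetical helper, not part of PDFreactor.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical probe: reads the PDF/A flavor a file declares, without validating it. */
final class PdfaIdProbe {

    // The properties may be written as XML elements or as attributes.
    private static final Pattern PART =
            Pattern.compile("pdfaid:part\\s*(?:=\\s*[\"']|>)\\s*(\\d)");
    private static final Pattern LEVEL =
            Pattern.compile("pdfaid:conformance\\s*(?:=\\s*[\"']|>)\\s*([ABUabu])");

    static Optional<String> declaredFlavor(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to one char, so binary content cannot break decoding.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher part = PART.matcher(text);
        Matcher level = LEVEL.matcher(text);
        if (!part.find() || !level.find()) {
            return Optional.empty(); // no PDF/A declaration found
        }
        return Optional.of("PDF/A-" + part.group(1)
                + level.group(1).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        byte[] sample = "<pdfaid:part>3</pdfaid:part><pdfaid:conformance>A</pdfaid:conformance>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(declaredFlavor(sample).orElse("no PDF/A declaration"));
    }
}
```

A declaration that is present but wrong can only be detected by real validation, which is why such a probe complements the validation option above rather than replacing it.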

The easiest way to archive electronic documents: PDF archiving with PDF/A and PDFreactor

Those who are legally obliged to archive their documents for a longer period of time can hardly avoid the PDF/A format. The PDF/A standards for the electronic archiving of documents ensure the reproduction of the visual appearance as well as the inclusion of the document's structure. The purpose of the archive determines which format you should use. Where there is no reason to do otherwise, it is advisable to choose the highest level of the standard for electronic archiving. Start your 30-day trial of PDFreactor right now to experience the advantages of PDF archiving. Check out our quick start guide to learn more about download, installation, integration and conversion.