<!DOCTYPE html>
<html lang="en-US">
    <head>
        <meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>
        <title>Tagged PDF, PDF/A and PDF/UA Compliance</title>
        <link href="../common.css" rel="stylesheet" type="text/css"/>
        <style>
            table {
                width: 100%;
            }
            #exampleTable {
                margin-top: 0.8cm;
                margin-bottom: 0.1cm;
                text-align: center;
            }
            #exampleTable > caption {
                color: #666666;
                font-size: 0.8em;
                font-weight: bold;
                padding-bottom: 1pt;
            }
            #columns {
                display: flex;
                justify-content: space-between;
            }
            #rightBlock {
                width: 36%;
                overflow: hidden;
            }
            #rightBlock img {
                height: 19.44cm;
            }
            #leftBlock {
                width: 59%;
            }
            #leftBlock h2 {
                margin-top: 0cm;
            }
            #leftBlock + * {
                clear: right;
            }

            .newChapter {
                margin-top: 1cm;
            }

        </style>
    </head>
    <body>
        <h1>Tagged PDF, PDF/A and PDF/UA Compliance</h1>

        <div id="columns">
            <div id="leftBlock">
                <h2>Tagged PDF</h2>

                <p>Tagged PDF files contain information about the structure of the document. This structural information
                is conveyed via so-called "PDF tags". Tagging a PDF makes it more accessible to screen readers, handhelds and similar devices.
                Tagging also improves copy-and-paste behavior. For example, copying a whole paragraph from a tagged PDF created with PDFreactor
                ignores the line breaks that are displayed in the PDF document. Furthermore, tagging enables content reflow.</p>

                <p>Using the <code>addTags</code> configuration property, you can add PDF tags to the PDF documents generated with PDFreactor. If you are generating
                a PDF from HTML documents, the HTML elements are automatically mapped to the corresponding PDF tags, so all you have to
                do is set this property to enable tagging.</p>
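
                <p>As a minimal sketch, tagging could be enabled in a Java integration as follows. It assumes that the <code>addTags</code> property is exposed
                through a setter named <code>setAddTags</code>, following the same naming pattern as the other configuration calls in this document:</p>

                <div class="code"><code>Configuration config = new Configuration();

<span class="comment">// Assumption: the setter name is derived from the "addTags" property</span>
config.setAddTags(true);</code></div>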

                <p>The following example maps the HTML element <code>img</code> to the PDF tag "Figure", and the content of its <code>alt</code> attribute to an alternative description
                for this tag.</p>

                <div class="code"><code>img {
    -ro-pdf-tag-type: "Figure";
}
img[alt] {
    -ro-alt-text: attr(alt);
}</code></div>
    
                <p>The screenshot (taken from Adobe Acrobat DC) on the right shows that PDFreactor is capable of properly tagging even complex structures such as tables.
                The table below was placed at the bottom of the page to demonstrate that PDFreactor won't repeat the &lt;table&gt; or &lt;thead&gt; tag even though
                the table is split across two pages.</p>
    
                <p>A tagged PDF will often be larger than an equivalent PDF file that does not include PDF tags. You can enable the full compression mode to
                reduce the document size. To do so, use the configuration property <code>fullCompression</code> in the PDFreactor integration:</p>
    
                <div class="code"><code>config.setFullCompression(true);</code></div>
            </div>

            <div id="rightBlock">
                <img src="tabletags.png" alt="Adobe Acrobat DC screenshot"/>
            </div>
        </div>

        <table id="exampleTable">
            <caption>Example table</caption>
            <thead>
                <tr>
                    <th>Employee</th>
                    <th>Mail</th>
                    <th>Phone</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>John Doe</td>
                    <td>john.doe@example.com</td>
                    <td>202-555-0152</td>
                </tr>
                <tr>
                    <td>Austin King</td>
                    <td>austin.king@example.com</td>
                    <td>202-555-0191</td>
                </tr>
                <tr>
                    <td>Edward Alsop</td>
                    <td>edward.alsop@example.com</td>
                    <td>202-555-0113</td>
                </tr>
                <tr>
                    <td>Brian Mitchell</td>
                    <td>brian.mitchell@example.com</td>
                    <td>202-555-0131</td>
                </tr>
            </tbody>
        </table>

        <h2 class="newChapter">PDF/A Conformance</h2>

        <p>PDF/A differs from PDF by prohibiting features ill-suited to long-term archiving, such as
        font linking (as opposed to font embedding).</p>

        <p>The PDF/A standard does not define an archiving strategy or the goals of an archiving system.
        It identifies a "profile" for electronic documents that ensures the documents can be reproduced exactly the
        same way using various software in years to come. A key element to this reproducibility is the requirement for
        PDF/A documents to be 100% self-contained. All of the information necessary for displaying the document in the same
        manner is embedded in the file. This includes, but is not limited to, all content (text, raster images and vector
        graphics), fonts and color information. A PDF/A document is not permitted to be reliant on information from external
        sources (e.g. font programs and data streams), but may include annotations (e.g. hypertext links) that link to 
        external documents.</p>

        <p>PDFreactor supports the creation of files that conform to all PDF/A standards.</p>

        <p>Many companies and government organizations worldwide require PDF/A-conformant documents.
        Tagged PDFs are a requirement of Section 508 of the U.S. Rehabilitation Act.</p>

        <p>PDF/A-1a is the strictest PDF/A standard, while the newer PDF/A standards are more lenient, e.g. allowing
        transparency and attachments.</p>

        <h3>Common PDF/A conformance requirements</h3>

        <table>
            <thead>
                <tr>
                    <th>PDF/A restriction</th>
                    <th>PDFreactor actions</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>All used fonts are embedded.</td>
                    <td>PDFreactor ignores the option to disable font embedding.</td>
                </tr>
                <tr>
                    <td>All images are embedded.</td>
                    <td>Images are always automatically embedded by PDFreactor.</td>
                </tr>
                <tr>
                    <td>Multi-media content is prohibited.</td>
                    <td>Embedding objects is automatically prevented by PDFreactor when PDF/A conformance is set.</td>
                </tr>
                <tr>
                    <td>JavaScript is prohibited.</td>
                    <td>No JavaScript is embedded when PDF/A conformance is set. (This does not prevent JavaScript in the
                    source HTML document from being processed during conversion.)</td>
                </tr>
                <tr>
                    <td>Encryption is disallowed.</td>
                    <td>This is automatically prevented when the PDF/A conformance is set.</td>
                </tr>
                <tr>
                    <td>The PDF must be tagged.</td>
                    <td>This is automatically done by PDFreactor when PDF/A conformance is set.</td>
                </tr>
                <tr>
                    <td>Metadata included in the PDF is required to be standard-based XMP.</td>
                    <td>This is automatically done by PDFreactor when PDF/A conformance is set.</td>
                </tr>
                <tr>
                    <td>Colors are specified in a device-independent manner.</td>
                    <td>In PDFreactor, colors are defined either as RGB or CMYK. When PDF/A conformance is set,
                    one of these color spaces has to be used in conjunction with a color space profile.
                    CMYK requires an ICC profile to be set, while RGB colors use a default sRGB profile if no other profile is set.
                    Using RGB colors in CMYK PDF/A documents, or vice versa, is prohibited.
                    Color keywords and shades specified via the "gray" function are converted losslessly to the appropriate
                    color space.</td>
                </tr>
            </tbody>
        </table>

        <h3>PDF/A-1a specific conformance requirements</h3>

        <table>
            <thead>
                <tr>
                    <th>PDF/A-1a restriction</th>
                    <th>PDFreactor actions</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>Transparency is disallowed.</td>
                    <td>PDFreactor will ignore certain kinds of image transparency. Other occurrences of transparency will cause an exception
                    to be thrown.</td>
                </tr>
                <tr>
                    <td>Attachments are disallowed.</td>
                    <td>This is automatically prevented when PDF/A-1a conformance is set.</td>
                </tr>
            </tbody>
        </table>

        <p>To create a PDF/A conformant document, the configuration property <code>conformance</code> can be used in the PDFreactor integration:</p>

        <div class="code"><code>config.setConformance(Conformance.PDFA3A);</code></div>

        <p>If CMYK colors are used in a document to be converted into a PDF/A-conformant file, an Output Intent has to
        be set. This can be done using the following API calls:</p>

        <div class="code"><code>Configuration config = new Configuration();

OutputIntent outputIntent = new OutputIntent();
outputIntent.setIdentifier("ICC profile identifier");

<span class="comment">// Use this if you are loading the ICC profile via URL</span>
outputIntent.setUrl("URL/to/ICC/profile");

<span class="comment">// Use this if you want to specify the ICC profile's binary data</span>
outputIntent.setData(iccProfileBinaryData);

config.setOutputIntent(outputIntent);</code></div>

        <p>The <code>identifier</code> property is a string identifying the intended output device or production condition in human-
        or machine-readable form. The <code>url</code> property points to an ICC profile file, while the <code>data</code> property
        contains the binary data of such a profile. Only one of the two needs to be set.</p>
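
        <p>The <code>iccProfileBinaryData</code> variable used above is not defined in the snippet. As a sketch, assuming the profile is stored in a local file
        (the path below is hypothetical) and that <code>setData</code> accepts a byte array, the data could be loaded with the standard Java NIO API:</p>

        <div class="code"><code>import java.nio.file.Files;
import java.nio.file.Paths;

<span class="comment">// Hypothetical path, replace it with the location of your ICC profile.
// Files.readAllBytes throws an IOException if the file cannot be read.</span>
byte[] iccProfileBinaryData = Files.readAllBytes(Paths.get("profiles/cmyk-profile.icc"));

<span class="comment">// Assumption: setData takes the raw profile bytes</span>
outputIntent.setData(iccProfileBinaryData);</code></div>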

        <div class="note">
            <p>When PDF/A conformance is set, encryption, restrictions, comments, full compression and other non-PDF/A-conformant features are automatically
            overridden, regardless of their own settings.</p>
            <p>Setting PDF/A-1a conformance generates PDFs using Adobe PDF version 1.4, in which some PDF tags are forbidden, e.g. &lt;tbody&gt;.
            PDFreactor will skip all forbidden tags automatically, but will still handle table headers correctly.</p>
        </div>

        <h2 class="newChapter">PDF/UA Conformance</h2>

        <p>PDF/UA (PDF/Universal Accessibility) is the informal name for ISO 14289, the International Standard for accessible PDF technology.
        A technical specification intended for developers implementing PDF writing and processing software, PDF/UA provides definitive terms 
        and requirements for accessibility in PDF documents and applications. For those equipped with appropriate software, conformance with 
        PDF/UA ensures accessibility for people with disabilities who use assistive technology such as screen readers, screen magnifiers, 
        joysticks and other technologies to navigate and read electronic content.</p>

        <p>PDF/UA can be combined with PDF/A to create PDFs that conform to both standards simultaneously. For this purpose, PDFreactor offers
        combined conformance constants such as the following:</p>

        <div class="code"><code>config.setConformance(Conformance.PDFA3A_PDFUA1);</code></div>
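
        <p>To put the individual calls into context, the following is a minimal sketch of a complete conversion. The package names, the <code>PDFreactor</code>
        and <code>Result</code> classes, the <code>setDocument</code>, <code>convert</code> and <code>getDocument</code> methods, and the input and output
        paths are assumptions that are not taken from this document. Check them against the API documentation of your PDFreactor version.</p>

        <div class="code"><code>import java.io.FileOutputStream;
import java.io.OutputStream;

<span class="comment">// Assumption: package and class names of the PDFreactor Java library</span>
import com.realobjects.pdfreactor.Configuration;
import com.realobjects.pdfreactor.Configuration.Conformance;
import com.realobjects.pdfreactor.PDFreactor;
import com.realobjects.pdfreactor.Result;

public class ConformanceExample {
    public static void main(String[] args) throws Exception {
        PDFreactor pdfReactor = new PDFreactor();

        Configuration config = new Configuration();
        <span class="comment">// Hypothetical input location</span>
        config.setDocument("file:///path/to/input.html");
        <span class="comment">// Request PDF/A-3a and PDF/UA-1 conformance at the same time</span>
        config.setConformance(Conformance.PDFA3A_PDFUA1);

        <span class="comment">// Assumption: convert returns a Result whose getDocument yields the PDF bytes</span>
        Result result = pdfReactor.convert(config);

        <span class="comment">// Hypothetical output location</span>
        try (OutputStream out = new FileOutputStream("output.pdf")) {
            out.write(result.getDocument());
        }
    }
}</code></div>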

    </body>
</html>