Signing PDF Signatures using C# under .NET

 

Monday, May 20, 2019

Creating PDF Digital Signatures using C# under .NET. Yes, complex and often confusing. To help you understand how to sign PDF documents and work with digital signatures using ABCpdf .NET and a little simple C#, we have written this – the definitive guide to making and manipulating PDF Digital Signatures using C#.

 

ABCpdf .NET has supported digitally signing and verifying signatures in PDF documents since 2007 - ABCpdf Version 6.

In that time the use of signed PDF documents has gained much more legal traction with many governments now accepting digitally signed PDFs for a number of official purposes.

In EU member states the electronic Identification, Authentication and trust Services (eIDAS) regulations provide a standardized legal framework for accepting electronic identification.

Building on top of this, the PAdES specification provides a set of technical standards for inserting and validating signatures in PDF documents.

Support for the PAdES standard was introduced into ABCpdf with the release of Version 11.3.

So What Is a Digital Signature?

At its simplest a digital signature is the same as a handwritten signature. It shows that you have signed a document electronically, in the same way as you might have signed a document with a pen.

A digital signature has the added advantage that once a document is digitally signed, you can prove that the document has not been changed since it was signed.

More specifically a digital signature is a scheme used to ensure the authenticity of a file such as a PDF document. Authenticity in this instance may refer to:

  • Proof that the document has not been modified since it was signed (non-tampering)
  • Proof that the document was digitally signed by a person or entity (proof of signer)
  • The person or entity that signed the document cannot deny that they signed it (non-repudiation)

Digital signatures rely on Public Key Infrastructure (PKI), a system and set of roles, policies and processes used to manage digital certificates on the internet.

Digital Signature Workflows

Documents can be signed many times for different reasons.

A purchase order document might be prepared by one person, reviewed by another and authorized by a third.

Each person in the workflow might sign the document; each signature results in a new version of the PDF.

Because each signature is specific to a particular version of the document, each signature is only valid for the version that was signed.

Most PDF readers will allow you to view the version for which a specific signature is valid.

Signatures, Keys and Certificates

 

Here is where we get into the meat of the subject. We explain the mathematics behind signatures, the concept of keys and certificates, how we know that a certificate is trusted, signature lifetime and how to extend the limited lifetime of a signature into the indefinite future.

 

The Science Bit

Creating a signature is, at core, a calculation using two very large numbers with a specific mathematical relationship.

This relationship ensures that if you "sign" some data with one number, you can validate that signature using the other.

Signing the data is simply performing a calculation with one number. To validate the signature, you perform a similar calculation with the other number.

A signature which includes all the data in a file is not necessary. We only need a number that is unique to that file. This fingerprint-like number is called a hash.

We create this number or hash using a one-way message digest algorithm like SHA-256. Older algorithms such as SHA-1 are no longer considered secure for signing.
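To make the fingerprint idea concrete, here is a minimal sketch that hashes two messages differing by a single character using SHA-256 from the .NET class library. The class and method names (HashDemo, Fingerprint) and the sample messages are ours, chosen purely for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashDemo {
  // Returns the SHA-256 digest of the data as an upper-case hex string
  public static string Fingerprint(byte[] data) {
    using (SHA256 sha = SHA256.Create()) {
      return BitConverter.ToString(sha.ComputeHash(data)).Replace("-", "");
    }
  }

  public static void Main() {
    string a = Fingerprint(Encoding.UTF8.GetBytes("Pay John 100 pounds"));
    string b = Fingerprint(Encoding.UTF8.GetBytes("Pay John 900 pounds"));
    Console.WriteLine(a);
    Console.WriteLine(b);
    // A one character change produces a completely different digest
    Console.WriteLine(a == b ? "same" : "different");
  }
}
```

Whatever the size of the input, the digest is always 32 bytes, which is why it is the digest rather than the whole file that gets signed.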

Keys and Certificates

The number you sign with is called the private key. The number other people validate with is called the public key.

The private key should only be known to the signer. The public key needs to be known by anyone who wants to validate the signature.
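The relationship between the two keys can be sketched with the RSA classes built into .NET. This is an illustration of the principle only, not what ABCpdf does internally: the key pair is generated on the fly, and the names KeyPairDemo and SignThenVerify are our own. It assumes .NET Framework 4.7.2 or later, or .NET Core.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyPairDemo {
  // Signs the data with the private key, optionally alters the data,
  // then validates the signature using only the public key
  public static bool SignThenVerify(byte[] data, bool tamper) {
    using (RSA signer = RSA.Create(2048)) {
      byte[] signature = signer.SignData(data, HashAlgorithmName.SHA256,
        RSASignaturePadding.Pkcs1);

      // false = export the public half only; the private key never leaves the signer
      RSAParameters publicKey = signer.ExportParameters(false);

      byte[] received = (byte[])data.Clone();
      if (tamper) received[0] ^= 0x01; // flip one bit

      using (RSA verifier = RSA.Create()) {
        verifier.ImportParameters(publicKey);
        return verifier.VerifyData(received, signature, HashAlgorithmName.SHA256,
          RSASignaturePadding.Pkcs1);
      }
    }
  }

  public static void Main() {
    byte[] data = Encoding.UTF8.GetBytes("Purchase order 1234");
    Console.WriteLine(SignThenVerify(data, false)); // untouched data validates
    Console.WriteLine(SignThenVerify(data, true));  // altered data does not
  }
}
```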

The public key is held in a file called a certificate. The private key associated with the certificate is generally held in a separate file. Each certificate contains other information including:

  • The owner of the certificate (name, organization etc.)
  • The certificate that signed this certificate (the issuer)
  • The date the certificate was issued (i.e. signed by the issuer)
  • The date the certificate expires
  • A serial number
  • What the certificate can be used for (key usage)
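These fields are all exposed by the .NET X509Certificate2 class. The sketch below creates a throwaway self-signed certificate so that it can run without any files, then prints the fields listed above. The subject name and the names CertificateFieldsDemo and CreateTestCertificate are invented for the example, and it assumes .NET Framework 4.7.2 or later, or .NET Core.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class CertificateFieldsDemo {
  // Creates a self-signed test certificate; a real one would come from a CA
  public static X509Certificate2 CreateTestCertificate(string subject) {
    using (RSA rsa = RSA.Create(2048)) {
      var request = new CertificateRequest(subject, rsa,
        HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
      DateTimeOffset now = DateTimeOffset.UtcNow;
      return request.CreateSelfSigned(now.AddDays(-1), now.AddYears(1));
    }
  }

  public static void Main() {
    using (X509Certificate2 cert = CreateTestCertificate("CN=John Doe, O=Example Ltd")) {
      Console.WriteLine("Owner:   " + cert.Subject);
      Console.WriteLine("Issuer:  " + cert.Issuer);
      Console.WriteLine("Issued:  " + cert.NotBefore);
      Console.WriteLine("Expires: " + cert.NotAfter);
      Console.WriteLine("Serial:  " + cert.SerialNumber);
      // Key usage, if present, is held in the extensions
      foreach (X509Extension ext in cert.Extensions)
        Console.WriteLine("Extension: " + ext.Oid.Value);
    }
  }
}
```

For a certificate held in a file you would instead pass the file path to the X509Certificate2 constructor.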

The Chain of Trust

So how can we trust a certificate? How do we know that the name included in the certificate really is the name of the person who is using it?

A certificate is issued by a Certificate Authority (CA). The CA ensures that the information that is provided to them is correct. When they issue the certificate the data in it is signed using the private key of their CA certificate.

So how do we know we can trust the CA? Well their certificate is in turn issued from another higher level CA which again may be in turn issued and signed by an even higher-level CA. And so on.

This forms a hierarchy - the chain of trust – with your certificate at the bottom, intermediary CAs above and ultimately at the top the final arbiter – the Root CA.

This top level certificate is signed by its own private key instead of that from another certificate. This is called the Root CA and it is the Trust Anchor (TA) for the entire chain of certificates. Somewhere on your computer there is a file which tells it what top level certificates it trusts.
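In .NET the chain can be walked with the X509Chain class, which consults the trusted root store of the operating system. The following is a minimal sketch using a freshly generated self-signed certificate, so the chain has one element and is reported as untrusted. The names ChainDemo, BuildChain and CreateSelfSigned are ours, and revocation checking is switched off to keep the example offline.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class ChainDemo {
  // Returns the subjects in the chain, end certificate first, root last
  public static List<string> BuildChain(X509Certificate2 cert, out bool trusted) {
    var subjects = new List<string>();
    using (var chain = new X509Chain()) {
      chain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;
      // Build returns true only if the chain ends at a trusted root
      trusted = chain.Build(cert);
      foreach (X509ChainElement element in chain.ChainElements)
        subjects.Add(element.Certificate.Subject);
    }
    return subjects;
  }

  // Throwaway certificate signed by its own private key, like a Root CA
  public static X509Certificate2 CreateSelfSigned(string subject) {
    using (RSA rsa = RSA.Create(2048)) {
      var request = new CertificateRequest(subject, rsa,
        HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
      DateTimeOffset now = DateTimeOffset.UtcNow;
      return request.CreateSelfSigned(now.AddDays(-1), now.AddYears(1));
    }
  }

  public static void Main() {
    bool trusted;
    X509Certificate2 cert = CreateSelfSigned("CN=Untrusted Test Root");
    foreach (string subject in BuildChain(cert, out trusted))
      Console.WriteLine(subject);
    Console.WriteLine("Trusted: " + trusted);
  }
}
```

With a certificate issued by a commercial CA the list would typically show your certificate, one or more intermediates and the root.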

Certificates are provided by Trust Service Providers (TSPs) or resellers. They rely on their Root CA Certificate being trusted to complete the chain of trust. These are companies such as Comodo, GeoTrust, DigiCert and GlobalSign.

Certificate Lifetime and Revocation

Certificates are valid for a limited time. Each certificate contains the date when it was issued and the date it expires. For a signature to be valid, the certificate has to be valid at the time the signature is created.

Should a private key become known to someone else, for example, made public on the internet, the certificate authority can revoke its corresponding certificate.

Software programs validating a signature may check online with the Certificate Authority to establish whether the certificate was revoked at the time the signature was created.

The certificate could also be revoked if the technology it uses becomes compromised.
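As a sketch of how validation software might approach this, the code below first checks the signing time against the validity period of the certificate, then asks X509Chain to validate the chain as of that time with online revocation checking. The names SigningTimeDemo, WasInValidityPeriod and ValidateAt are hypothetical, and exactly how revocation is evaluated relative to the verification time depends on the platform, so treat this as a starting point rather than a complete validator.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class SigningTimeDemo {
  // True if the signing time falls between the issue and expiry dates
  public static bool WasInValidityPeriod(X509Certificate2 cert, DateTime signingTimeUtc) {
    // NotBefore and NotAfter are reported in local time
    DateTime local = signingTimeUtc.ToLocalTime();
    return local >= cert.NotBefore && local <= cert.NotAfter;
  }

  // Validates the chain as of the signing time; needs network access
  // to reach the CRL or OCSP servers named in the certificates
  public static bool ValidateAt(X509Certificate2 cert, DateTime signingTimeUtc) {
    using (var chain = new X509Chain()) {
      chain.ChainPolicy.VerificationTime = signingTimeUtc;
      chain.ChainPolicy.RevocationMode = X509RevocationMode.Online;
      chain.ChainPolicy.RevocationFlag = X509RevocationFlag.ExcludeRoot;
      return chain.Build(cert);
    }
  }

  // Throwaway certificate valid from yesterday for one year
  public static X509Certificate2 CreateTestCertificate() {
    using (RSA rsa = RSA.Create(2048)) {
      var request = new CertificateRequest("CN=Validity Test", rsa,
        HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
      DateTimeOffset now = DateTimeOffset.UtcNow;
      return request.CreateSelfSigned(now.AddDays(-1), now.AddYears(1));
    }
  }

  public static void Main() {
    X509Certificate2 cert = CreateTestCertificate();
    Console.WriteLine(WasInValidityPeriod(cert, DateTime.UtcNow));
    Console.WriteLine(WasInValidityPeriod(cert, DateTime.UtcNow.AddYears(2)));
  }
}
```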

Keeping the Private Key Safe

Many CAs will only issue you a certificate if the private key is stored safely. This might be on a hardware device called a Hardware Security Module (HSM).

At its simplest this is a small USB key such as the Gemalto eToken 5110. More complex devices may include rack-mounted server units holding many certificates and private keys.

An HSM holds private keys in a way that makes it practically impossible for any program on the operating system to read them. Instead the digest is signed on the HSM which has a processor and firmware for this purpose. The HSM will likely be password protected, further increasing security.

When Was it Signed?

With a basic signature you can specify the signing time using the time on the computer. But computer clock times can be changed and so you can never be sure this is accurate.

A more definitive method of indicating signing time involves using a timestamping authority. This is a trusted server that you can request to sign the signature itself with the addition of a timestamp.

The certificate that the timestamping authority signs the timestamp with is usually returned in the timestamp response.

Some certificates include the URL of a timestamping server considered trustworthy by the CA.

Keeping Signatures for Long Term Archival

 

Signature Longevity

If certificates are valid for a limited period of time, how do we check that a signature is valid after it expires, or indeed if it has been revoked?

Most CAs also operate as Validation Authorities (VAs) and will include information within the certificate to allow checking for certificate revocation. At its most basic this is the URL of a Certificate Revocation List (CRL), a signed list published by the CA of the certificates it has revoked.

However CRLs can get very long and so the certificate may also (or instead) contain the URL of an Online Certificate Status Protocol (OCSP) server, which reports the revocation status of an individual certificate on request. Each response is itself a signed data object.

It is also possible to add into a PDF document, evidence that the certificates used in the signature were valid at a specific point in time – usually the signing time. This information can be put into an optional dictionary in the document catalog called the Document Security Store (DSS) which may contain:

  • OCSPs - an array of OCSP responses.
  • CRLs - an array of CRLs.
  • Certs – an array of certificates. Ideally this should contain all certificates required to validate the signature, any timestamp and the OCSP and CRL responses.
  • VRI – Validation Related Information. This contains the same information as above but on a per signature basis. It may also have a time or an indirect reference to a timestamp stream for when the entry was created.

This information is referred to as Long Term Validation (LTV) or Long Term Archival (LTA).

Futureproofing Signatures

Over time technology improves and computers get faster. So the algorithms we used to generate our certificates, sign and timestamp our signatures, may get broken or compromised.

To protect against this, we can apply a special signature to a document called a document timestamp. This signature confirms the state of the document at an authoritative time. It proves that at that time, the signature and associated long term validation information had not been tampered with and were viewed as valid.

A document may be repeatedly timestamped using newer signature technologies to prove that no tampering has occurred over time. Repeatedly timestamping a document, using an algorithm that is valid at that time, provides proof of the integrity of the original signature.

What Is PAdES?

PAdES (PDF Advanced Electronic Signatures) is a specialization of the CAdES standard for PDF documents.

CAdES (CMS Advanced Electronic Signatures) is a standard developed by the European Telecommunications Standards Institute (ETSI) to facilitate secure paperless transactions throughout the EU.

PAdES has four baseline profiles that may be very simply expressed as follows:

  • PAdES B-Level (basic signature)
  • PAdES T-Level (PAdES B- with an authoritative timestamp)
  • PAdES LT-Level (PAdES T- with added Long Term Validation information)
  • PAdES LTA-Level (PAdES LT- with added authoritative document timestamp signature)

The quickest reference to the differences between these levels can be found on the ETSI site. This distils the lengthier texts into a manageable summary with cross references for deeper reading.

Under the Hood

 

Enough of the basics and background - time to get our hands dirty. Let's put up the hood and look inside the actual implementation of signed documents. We'll examine the file types involved and the mechanics of actually signing a document.

 

What’s in a Certificate?

Signatures are based on Cryptographic Message Syntax (CMS). The Internet Engineering Task Force (IETF) created the CMS standard (RFC 5652) for producing digests of, signing, authenticating or encrypting any form of digital data. This is based on the syntax of PKCS#7.

CMS relies on a set of Public Key Cryptography Standards (PKCS) which are rather unhelpfully identified only by a number.

  • PKCS#1 defines the mathematical aspects of public and private keys and their use in encryption.
  • PKCS#7 defines the Cryptographic Message Syntax Standard at a technical level in terms of signing, encryption and decryption.
  • PKCS#12 defines an archive file format for PKCS objects.

PKCS#12 is what you will come across most often, in the format of a file with an extension of “.p12” or “.pfx”. This format is like a big bag in which you can store certificates, private keys and other items of data. These files are normally password-protected.

Most PKCS defined objects are defined in terms of Distinguished Encoding Rules (DER) encoded Abstract Syntax Notation One (ASN.1). ASN.1 is a cross-platform interface description language. It defines some basic types such as integers, booleans, character strings, octet strings; as well as structures like lists (sequences) and choices.
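DER stores every value as a tag, a length and then the content. As a rough illustration, the sketch below reads the tag and length at the start of a DER encoded object. The names DerDemo and ReadHeader are ours, and the code deliberately ignores multi-byte tags and other corner cases, so it is no substitute for a real ASN.1 parser.

```csharp
using System;

public static class DerDemo {
  // Reads the tag and content length of the first element.
  // Returns the offset at which the content starts.
  public static int ReadHeader(byte[] der, out byte tag, out int length) {
    tag = der[0];
    int first = der[1];
    if (first < 0x80) {
      // Short form: the byte is the length itself
      length = first;
      return 2;
    }
    // Long form: the low seven bits say how many length bytes follow
    int count = first & 0x7F;
    length = 0;
    for (int i = 0; i < count; i++)
      length = (length << 8) | der[2 + i];
    return 2 + count;
  }

  public static void Main() {
    byte tag;
    int length;

    // INTEGER 5 is encoded as tag 0x02, length 1, content 0x05
    int offset = ReadHeader(new byte[] { 0x02, 0x01, 0x05 }, out tag, out length);
    Console.WriteLine("tag=0x{0:X2} length={1} content at {2}", tag, length, offset);

    // The start of a SEQUENCE (tag 0x30) with 0x010A = 266 bytes of content
    offset = ReadHeader(new byte[] { 0x30, 0x82, 0x01, 0x0A }, out tag, out length);
    Console.WriteLine("tag=0x{0:X2} length={1} content at {2}", tag, length, offset);
  }
}
```

An X.509 certificate is a SEQUENCE, so the RawData of any X509Certificate2 begins with the byte 0x30.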

ASN.1 is used to define almost all objects including: the certificate, the private key, Online Certificate Status Protocol (OCSP) responses, Certificate Revocation Lists (CRLs) and Certificate Signing Requests (CSRs). There are numerous RFCs that cover different objects specified in ASN.1.

X.509 is a standard for encoding public key certificates in ASN.1 format. This is not a file format as such - more a recipe for what must and what might appear in the certificate. Most significantly it mandates the details of the algorithm that should be used to validate the signature.

How is a PDF Signed?

When we sign a PDF document we create a signature area in the PDF. We reserve a space in that area to put the byte range that we are signing and the actual CMS signature. To obtain the signature we first take a digest of the data to be signed – excluding a "hole" in the data where we will be inserting the CMS.

We may use a number of one-way digest algorithms for the CMS such that we get a fingerprint that uniquely identifies the document.
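The idea of digesting everything except the hole can be sketched as follows. In a PDF the signed regions are recorded in the signature's /ByteRange array as pairs of offset and length. The "file" here is a made-up eighteen byte string rather than a real PDF, and the names ByteRangeDemo and Digest are ours. IncrementalHash requires .NET Framework 4.7.1 or later, or .NET Core.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ByteRangeDemo {
  // byteRange holds (offset, length) pairs describing the signed regions
  public static byte[] Digest(byte[] file, int[] byteRange) {
    using (IncrementalHash hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256)) {
      for (int i = 0; i + 1 < byteRange.Length; i += 2)
        hash.AppendData(file, byteRange[i], byteRange[i + 1]);
      return hash.GetHashAndReset();
    }
  }

  public static void Main() {
    // Bytes 4 to 13 stand in for the hole reserved for the CMS signature
    byte[] file = Encoding.ASCII.GetBytes("HEAD<00000000>TAIL");
    int[] byteRange = { 0, 4, 14, 4 };
    string before = Convert.ToBase64String(Digest(file, byteRange));

    // Writing into the hole does not alter the digest
    file[5] = (byte)'F';
    string afterHole = Convert.ToBase64String(Digest(file, byteRange));

    // Changing a signed byte does
    file[0] = (byte)'h';
    string afterEdit = Convert.ToBase64String(Digest(file, byteRange));

    Console.WriteLine(before == afterHole);
    Console.WriteLine(before == afterEdit);
  }
}
```

This is why the CMS can be written into the file after the digest has been taken without invalidating the signature.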

Next we encrypt this digest string, along with any additional signed attributes we wish to place in the CMS, using the private key. We may additionally add any unsigned attributes into the CMS – such as a signature timestamp. Included in the CMS should be the certificate containing the public key that matches the private key used to sign.

Any timestamp should also include the Timestamp Service Authority (TSA) certificate. In this way a validating software application can immediately determine the integrity of the signature – if not the authenticity of the actual certificate.

We may also include the certificate hierarchy so that validation software can build the certificate chain of trust.
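Outside of a PDF, the same kind of detached CMS signature can be produced with the SignedCms class in .NET. The sketch below signs a byte array with a throwaway self-signed certificate and then checks the result. It illustrates the CMS step only and is not how ABCpdf builds its signatures. The names CmsDemo, SignDetached, Verify and CreateTestCertificate are ours, and on .NET Core and later we assume the System.Security.Cryptography.Pkcs package is referenced.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.Pkcs;
using System.Security.Cryptography.X509Certificates;
using System.Text;

public static class CmsDemo {
  // Produces a detached CMS: the signature travels separately from the data
  public static byte[] SignDetached(byte[] data, X509Certificate2 cert) {
    var cms = new SignedCms(new ContentInfo(data), true);
    var signer = new CmsSigner(cert) {
      DigestAlgorithm = new Oid("2.16.840.1.101.3.4.2.1"), // SHA-256
      IncludeOption = X509IncludeOption.EndCertOnly        // embed the signing certificate
    };
    cms.ComputeSignature(signer);
    return cms.Encode();
  }

  // Checks the signature against the data; does not check certificate trust
  public static bool Verify(byte[] data, byte[] encodedCms) {
    var cms = new SignedCms(new ContentInfo(data), true);
    cms.Decode(encodedCms);
    try {
      cms.CheckSignature(true); // true = verify the signature only
      return true;
    }
    catch (CryptographicException) {
      return false;
    }
  }

  // Throwaway self-signed certificate with a private key
  public static X509Certificate2 CreateTestCertificate() {
    using (RSA rsa = RSA.Create(2048)) {
      var request = new CertificateRequest("CN=CMS Test Signer", rsa,
        HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
      DateTimeOffset now = DateTimeOffset.UtcNow;
      return request.CreateSelfSigned(now.AddDays(-1), now.AddYears(1));
    }
  }

  public static void Main() {
    byte[] data = Encoding.UTF8.GetBytes("document bytes");
    byte[] cms = SignDetached(data, CreateTestCertificate());
    Console.WriteLine(Verify(data, cms));
    Console.WriteLine(Verify(Encoding.UTF8.GetBytes("Document bytes"), cms));
  }
}
```

Passing true to CheckSignature confirms integrity only, which mirrors the distinction made above between the integrity of the signature and the authenticity of the certificate.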

What Type of Certificate Can I Use?

The permitted uses of a certificate are listed in the ASN.1 encoded extension attributes of the certificate. There are two extensions which are relevant, both defined in RFC 5280 (which superseded RFC 3280).

The Key Usage extension is a bit field which defines the basic purpose of the key – for example digital signature, non-repudiation and key encipherment.

The Extended Key Usage (EKU) extension further refines this, defining extra purposes for which the public key may be used. These are specified in terms of Object Identifiers (OIDs), a dotted number sequence (e.g. 1.3.6.1.4.1.343) used to represent things like code-signing, email-protection, time stamping, authentication and others. Third parties such as Microsoft may define additional EKUs.

In theory any certificate can be used. The technical aspects of signing are all the same. It is simply that PDF validation software may only accept certificates with a particular set of permitted uses.

For PDFs a key usage of digital signature is the minimum required usage. Non-repudiation may also be necessary if that is the reason for the signature.

For Extended Key Usage it seems that Acrobat will allow a number of different EKUs including email protection, code signing and Microsoft's document signing.
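To see what usages a particular certificate carries, you can read the two extensions with the .NET X509 classes. The sketch below builds a test certificate with digital signature and non-repudiation key usage plus the email protection EKU (OID 1.3.6.1.5.5.7.3.4), then reads them back. The names KeyUsageDemo, GetKeyUsage, GetExtendedKeyUsages and CreateTestCertificate are ours.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class KeyUsageDemo {
  // Returns the Key Usage bit field, or None if the extension is absent
  public static X509KeyUsageFlags GetKeyUsage(X509Certificate2 cert) {
    foreach (X509Extension ext in cert.Extensions) {
      var usage = ext as X509KeyUsageExtension;
      if (usage != null) return usage.KeyUsages;
    }
    return X509KeyUsageFlags.None;
  }

  // Returns the OIDs listed in the Extended Key Usage extension
  public static List<string> GetExtendedKeyUsages(X509Certificate2 cert) {
    var oids = new List<string>();
    foreach (X509Extension ext in cert.Extensions) {
      var eku = ext as X509EnhancedKeyUsageExtension;
      if (eku == null) continue;
      foreach (Oid oid in eku.EnhancedKeyUsages) oids.Add(oid.Value);
    }
    return oids;
  }

  // Throwaway certificate carrying both extensions
  public static X509Certificate2 CreateTestCertificate() {
    using (RSA rsa = RSA.Create(2048)) {
      var request = new CertificateRequest("CN=Usage Test", rsa,
        HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
      request.CertificateExtensions.Add(new X509KeyUsageExtension(
        X509KeyUsageFlags.DigitalSignature | X509KeyUsageFlags.NonRepudiation, true));
      request.CertificateExtensions.Add(new X509EnhancedKeyUsageExtension(
        new OidCollection { new Oid("1.3.6.1.5.5.7.3.4") }, false));
      DateTimeOffset now = DateTimeOffset.UtcNow;
      return request.CreateSelfSigned(now.AddDays(-1), now.AddYears(1));
    }
  }

  public static void Main() {
    X509Certificate2 cert = CreateTestCertificate();
    Console.WriteLine("Key usage: " + GetKeyUsage(cert));
    foreach (string oid in GetExtendedKeyUsages(cert))
      Console.WriteLine("EKU: " + oid);
  }
}
```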

If in addition you want the PDF document to validate in Acrobat by default – i.e. without explicitly setting Acrobat to trust a signer – then you will need to ensure that the certificate is issued by a vendor on the Adobe Approved Trust List (AATL).

For an illuminating discussion of the history behind the types of certificates that Acrobat will accept, see Steven Madwin’s (Adobe’s signature guru) last post here.

Signing Documents with ABCpdf

 

So how do I sign a document in ABCpdf .NET? This is where we get into the nitty gritty of what you need to do.

 

First you’ll need a signing certificate…

The CA you use and the type of certificate you need depends on how the signature needs to be validated.

If you just want a blue "trusted" bar to appear in Adobe Acrobat, then you need a certificate issued by a CA trusted by Adobe. The Adobe Approved Trust List (AATL) has a list of such CAs.

For dealing with governments you may need a government-approved vendor. For example the European Commission has a list of trusted digital ID providers. You will need to prove to the CA that you are who you say you are and this may include a personal interview.

Most vendors insist on issuing a signing certificate on a Hardware Security Module (HSM) in order to secure the private key. So allow time for shipping!

Of course you can use any certificates, even self-generated ones, if you manually trust them in Windows or the PDF application the recipients will be using.

What Type of Certificate Do I Need?

Again this largely depends on who is validating the signature and the purpose of the signature.

In theory any certificate can be used to sign a document because all that is really needed are the two keys. But certificates are generally issued with specific uses specified in the certificate. See Under the Hood above, for details.

For Adobe Acrobat a certificate issued for email signing is sufficient for a signature to be valid, provided the CA is trusted via the AATL.

A Digital ID, however, should really have a non-repudiation attribute. This usually means that the private key has been issued in a way that is kept secure – such as on an HSM.

Certification authorities may differ in what key usages they issue a signing certificate with. PDF validation software may have different ideas about key usage.

The Code

Let's say you have a signature field named "Signature1" in a file called MyDoc.pdf.

You would like to sign the field using a certificate from a PKCS#12 file called JohnDoe.p12 and for the output PDF to be compliant with the PAdES Baseline Long Term Archival (PAdES B-LTA) standard.

For this you would write code of the following form.

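using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;
using WebSupergoo.ABCpdf11;          // namespace varies with your ABCpdf version
using WebSupergoo.ABCpdf11.Objects;

Doc doc = new Doc();
doc.Read("MyDoc.pdf");
Signature sig = (Signature)doc.Form.Fields["Signature1"];
sig.Reason = "Final Version";
sig.Location = "London";
// Timestamp authority to use when timestamping
sig.TimestampServiceUrl = new Uri("http://timestamp.digicert.com");
sig.Compliance = Signature.ComplianceLevel.PAdES_B_LTA;
// The Exportable flag is required when passing in a certificate this way
X509Certificate2 cert = new X509Certificate2("JohnDoe.p12", "password",
  X509KeyStorageFlags.Exportable);
sig.Sign(cert, true, new Oid(CryptoConfig.MapNameToOID("SHA256")));
doc.Save("SignedDoc.pdf");
...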

Assuming the JohnDoe.p12 file holds a private key and its corresponding certificate, along with that certificate's issuing certificate, this code will:

  1. Sign the document data with the certificate (PAdES B-B)
  2. Add a timestamp to the signature using DigiCert's Timestamp Service Authority (PAdES B-T)
  3. Add a Document Security Store (DSS) entry, making the document compliant with PAdES B-LT. The entry consists of:
    - The certificates in the chain of the signing certificate
    - Any OCSP responses or CRLs retrieved for each certificate
    - Optionally a timestamp for when this information was added
  4. Finally add a document timestamp signature to the document, making the entire document compliant with PAdES B-LTA

Please note that when passing in a certificate in this way you must initialize X509Certificate2 with the X509KeyStorageFlags.Exportable storage modifier.

If you would like to use a Hardware Security Module (HSM) such as a Gemalto eToken USB key containing your key, then you can use the following Sign method:

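// cert is an X509Certificate2 obtained from the Windows Certificate Store.
// password is a SecureString holding the token password, or null to show a prompt.
sig.Sign(cert, password, new Oid(CryptoConfig.MapNameToOID("SHA256")));
...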

To use this, the Authentication Client software must be set up to automatically import the certificates on the token into the Windows Certificate Store.

In the case of an HSM, the private key cannot be exported. Instead firmware on the USB key does the signing. If you pass in the correct password for the USB key via the password parameter, no prompt will appear at the signing. If you pass a null password, then a password prompt from the Authentication Client will appear.

For some cloud-based HSMs there is no signature prompt available, only an API which does the signing. This allows the private key to be kept extremely secure - well away from the code that needs it. In this case you need to set a signature callback to allow the HSM to sign the data provided by ABCpdf .NET.

In this example we use .NET to sign the data in the callback, but of course you would need to adapt this code to instead use your HSM API.

public void SignDoc() {
  // Just use public certificate from file - i.e. do not obtain from registry
  using (Doc doc = new Doc()) {
    doc.Read(@"C:\DocToSign.pdf");
    Signature sig = (Signature)doc.Form.Fields["Signature1"];
    sig.CustomSigner = ExternalSigner;
    sig.Reason = "Test External Signing";
    X509Certificate2 cert = new X509Certificate2(@"C:\GlobalSign.cer");
    sig.Sign(cert, true, new Oid(CryptoConfig.MapNameToOID("SHA512")),
        X509IncludeOption.EndCertOnly);
    doc.Save(@"C:\SignedDoc.pdf");
  }
}

byte[] ExternalSigner(byte[] data) {
  string serial = "10 20 30 10 40 10 40 50 60 10 20 30"; // needs value
  SecureString password = new SecureString(); // needs value
  X509Certificate2 cert = null;
  X509Store store = new X509Store(StoreName.My, StoreLocation.CurrentUser);
  try {
    store.Open(OpenFlags.ReadOnly | OpenFlags.OpenExistingOnly | OpenFlags.MaxAllowed);
    cert = store.Certificates.Find(X509FindType.FindBySerialNumber, serial, false)[0];
  }
  finally {
    store.Close();
  }
  if (cert.PrivateKey is RSACryptoServiceProvider == false)
    throw new Exception("Unsupported key type.");
  RSACryptoServiceProvider rsa = (RSACryptoServiceProvider)cert.PrivateKey;
  CspParameters cspParams = new CspParameters(1, rsa.CspKeyContainerInfo.ProviderName,
    rsa.CspKeyContainerInfo.UniqueKeyContainerName) {
    KeyPassword = password,
    Flags = CspProviderFlags.NoPrompt
  };
  RSACryptoServiceProvider service = new RSACryptoServiceProvider(cspParams);
  return service.SignData(data, "2.16.840.1.101.3.4.2.3"); // SHA512
}
...

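So if you are using Azure Key Vault, you might use a sign function of the following form.

byte[] ExternalSigner(byte[] data) {
  // Key Vault signs a digest rather than the raw data, so hash with SHA-512 first
  byte[] hash;
  using (SHA512 sha = SHA512.Create())
    hash = sha.ComputeHash(data);
  // GetKeyVaultClient() and KeyIdentifier stand for your own client and key URL
  return GetKeyVaultClient().SignAsync(KeyIdentifier, JsonWebKeySignatureAlgorithm.RS512,
    hash).Result.Result;
}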

Not Technical Enough?

 

Why not generate your own certificates? X-Certificate and Key Management will allow you to do this. It has a facility where you can generate a similar certificate to an existing one and it can even interact with HSMs.

Why not get inside your certificates? For examining ASN.1 encoded objects, such as raw certificates and CMS signatures, the lapo website provides an excellent decoder written entirely in JavaScript.

Why not OID? When you decode your ASN.1 you will likely encounter some Object Identifiers (OIDs). There are a lot of directories but we like Oid-Info.com.

Bear in mind that with some twelve years of experience here we are well placed to offer definitive advice on all aspects of this type of solution.

Should you feel that perhaps you would like to download a copy of ABCpdf .NET - Welcome to the Party!




