Understanding SSL – Part 1: Certificates and Keys

The technology behind Secure Sockets Layer (SSL) network connections is often perceived as a bit of “black magic” – smoke and mirrors securing our Internet connections from snooping.  When banking and shopping online, even a novice user understands their browser sets up an HTTPS connection (which is simply HTTP over SSL) to protect the transaction.  It’s easy to simply surf to a secure URL and know that, somehow, SSL is magically keeping you safe.

Developing software that uses SSL is an entirely different matter.  The simplicity quickly fades, and the developer must confront the complexities of certificate management, trust stores, handshaking, and a host of other details that must be perfectly aligned to make the secure communication work.  In Part 1, we’ll cover SSL concepts at a very high level.  In subsequent posts, we’ll take a deeper dive into making these connections happen in both Java and C#.

Understanding SSL

SSL uses public key/private key cryptography for three purposes. The most fundamental use is to encrypt data communication between the server and client (in practice, the key pair protects the setup of a shared session key, and that session key encrypts the traffic).  It is also used to allow the server to prove its identity to the client and prevent man-in-the-middle attacks (where a malicious intermediary intercepts messages from the client and masquerades as the intended server).

A third use is for allowing the client to prove its identity to the server.  Mutual authentication is an important and powerful feature of SSL, and it’s probably underused.  For now, we’ll just focus on the semantics of server authentication.  If you understand server authentication, you’ll be well on your way to understanding client authentication on your own.

About Certificates

Three fundamental components are involved in setting up an SSL connection between a server and client: a certificate, a public key, and a private key.

Digital certificates are used to identify an entity.  The entity could be a person (when used for secure email), or it could be a computer (when used for SSL).  There is quite a bit of information stored in a digital certificate, but the most important part is the name of the entity it is identifying.

The identity, also known as the “subject”, is specified as an X.509 distinguished name.  A distinguished name contains multiple components.  For example, the distinguished name on the certificate used to set up an HTTPS connection for Amazon.com’s shopping cart check-out looks like this:

CN=www.amazon.com,

O=Amazon.com Inc.,

L=Seattle,

S=Washington,

C=US

How do we find this information?  From Internet Explorer, browse to any secure web page (one with an https:// protocol in the URL).  Right-click on the page and select “Properties”.  From the properties page, click the “Certificates” button.  On the “Details” tab, we can see all of the information embedded in the certificate.  By selecting the “Subject” entry, we can see the entity the certificate identifies:

Certificate Viewer in Internet Explorer
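
The same subject information can also be read programmatically.  Here is a minimal sketch in Python, used purely for illustration (later posts in this series target Java and C#).  It relies only on the standard `ssl` and `socket` modules; the helper names `fetch_peer_cert` and `subject_fields` are made up for this example.

```python
import socket
import ssl


def fetch_peer_cert(host, port=443, timeout=5.0):
    """Connect over SSL/TLS and return the server certificate as a dict."""
    context = ssl.create_default_context()  # verifies the chain and the host name
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def subject_fields(cert):
    """Flatten the nested 'subject' tuples from getpeercert() into a dict."""
    fields = {}
    for rdn in cert.get("subject", ()):
        for name, value in rdn:
            fields[name] = value
    return fields
```

Calling `subject_fields(fetch_peer_cert("www.amazon.com"))` should return the subject components under long names such as `commonName` and `organizationName` rather than the abbreviations `CN` and `O`.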

For the purposes of establishing an SSL connection to a server, the only interesting part of the distinguished name is the “Common Name” specified by the “CN” component.  On a server certificate, it holds the domain name of the host the certificate identifies.

To establish a secure connection to Amazon.com, a client first resolves the domain name www.amazon.com.  After the SSL connection has been initiated, one of the first things the server will do is send its digital certificate.  The client will perform a number of validation steps before determining if it will continue with the connection.

Most importantly, the client will compare the domain name of the server it intended to connect to (in this case, www.amazon.com) with the common name (the “CN” field) found in the subject’s identity on the certificate.  Current browsers make this comparison against the DNS names listed in the certificate’s Subject Alternative Name extension rather than the CN, but the principle is the same.  If the names do not match, the client cannot trust the identity of the server (and will likely choose to terminate the connection).
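
The name comparison can be sketched in a few lines.  This is a simplified illustration, not the complete matching rules a real client follows; the function names are hypothetical, and the only wildcard form it handles is a single leading `*.` label.

```python
def hostname_matches(pattern, hostname):
    """Return True if a certificate name (optionally '*.example.com') covers hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        return pattern == hostname
    # A wildcard stands in for exactly one left-most label.
    head, _, rest = hostname.partition(".")
    return bool(head) and rest == pattern[2:]


def certificate_covers(names, hostname):
    """Return True if any name taken from the certificate matches hostname."""
    return any(hostname_matches(name, hostname) for name in names)
```

In real code this check is best left to the SSL library, which performs it as part of the handshake when host name checking is enabled.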

Although the server name may be correct, the client must still verify the integrity of the certificate to determine if it has been forged or tampered with.  The client does this by verifying the digital signature on the certificate.  We’ll talk more about digital signatures in a moment.

The client will also ensure the certificate is being used within a valid time frame.  All certificates contain an “issue date” and an “expiration date”.  A certificate is considered invalid outside of that date range.
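
As a rough sketch of the date check in Python: the two dates appear in the certificate as the `notBefore` and `notAfter` fields, and the standard `ssl` module can convert their text form to a timestamp.  The function name `is_within_validity` is an assumption made for this example.

```python
import ssl
import time


def is_within_validity(not_before, not_after, now=None):
    """Check a time against the certificate's validity window.

    The date strings use the format that getpeercert() reports,
    for example "Jan  1 00:00:00 2020 GMT".
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now <= end
```

Passing `now` explicitly makes it easy to test the check against a fixed time rather than the system clock.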

Public and Private Keys

Public keys and private keys are number pairs with a special relationship.  Any data encrypted with one key can be decrypted with the other.  This is known as asymmetric encryption.  The security of asymmetric encryption lies in the difficulty of cracking encrypted data even when the key used for encryption is known.

The server’s public key is embedded within its certificate.  The public key is freely distributed so anyone wishing to establish an encrypted channel with the server may encrypt their data using the server’s public key.  The server will decrypt this message using its private key.  For this reason, private keys are closely guarded and kept secure.
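
To make the key relationship concrete, here is a toy example in Python using RSA with tiny primes.  It is for illustration only: the numbers are assumptions chosen to keep the arithmetic readable, while real keys are thousands of bits long and real implementations add padding.

```python
# Toy RSA key pair built from tiny primes. Illustration only.
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (needs Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(value, key):
    """Raise value to the key's exponent modulo n (encrypts or decrypts)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                     # must be smaller than n
ciphertext = apply_key(message, public_key)      # anyone can do this
recovered = apply_key(ciphertext, private_key)   # only the key owner can
```

The same `apply_key` function serves both directions, which is the “special relationship” at work: whatever one key does, only the other key undoes.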

Digital Signatures

Just as data may be encrypted with a public key and decrypted with a private key, the reverse is also true.  Data encrypted with a private key may be decrypted with the corresponding public key.

This property of keys is used to ensure the integrity of a digital certificate in a process called digital signing.

A hashing algorithm (such as SHA1 or MD5) is a means of processing all of the bytes of a message and producing a fixed-size numeric “hash value”.  (SHA1 and MD5 are now considered too weak for signing certificates; SHA-256 is the usual choice today.)  Hash values have long been used to ensure the integrity of messages that may become corrupted during transport.  A sender will transmit a message followed by the hash value it calculated for the message it intends to send.  The receiver calculates the hash value for the message it receives.  If the receiver calculates a different hash value than the one that was sent, the receiver concludes the message was corrupted during transit (and generally asks the sender to resend).
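
The send-and-compare exchange looks roughly like this in Python.  The sketch uses SHA-256 from the standard `hashlib` module; the function names are invented for the example.

```python
import hashlib


def digest(message: bytes) -> str:
    """Return the SHA-256 hash value of the message as hex text."""
    return hashlib.sha256(message).hexdigest()


def send(message: bytes):
    """The sender transmits the message together with its hash value."""
    return message, digest(message)


def received_intact(message: bytes, claimed_digest: str) -> bool:
    """The receiver recomputes the hash and compares it with the one sent."""
    return digest(message) == claimed_digest
```

This detects accidental corruption only.  Nothing stops whoever alters the message from altering the hash value to match.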

A hashing algorithm may also be used to determine if a message has been forged or tampered with.  Complicating this, however, is that a malicious third party could intercept a message, modify it, and simply recalculate the hash.  Asymmetric encryption technology solves this.

If a message sender wants to convince a recipient that his message is authentic and has not been tampered with, he will do two things.  First, the sender will calculate a hash value for the message.  Second, he will encrypt this hash value using his private key (a key which the sender, and only the sender, knows) and send the result, the signature, along with the message.  When the recipient receives the message, she decrypts the signature using the sender’s public key to recover the sender’s hash value, and calculates her own hash value for the message she received.  If the message has been tampered with, or if the hash was encrypted with anything other than the sender’s private key, the two hash values will not agree and the recipient will not consider the message authentic.
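
Here is a toy version of signing and verifying in Python.  It reuses textbook RSA with tiny primes and shrinks the hash to fit the toy key, so it is an assumption-laden sketch of the idea rather than a usable scheme; all names are hypothetical.

```python
import hashlib

# Toy RSA key pair (tiny primes, illustration only).
N = 61 * 53
E = 17                         # public exponent
D = pow(E, -1, 60 * 52)        # private exponent (needs Python 3.8+)


def toy_hash(message: bytes) -> int:
    """SHA-256 reduced modulo N so it fits the toy key. Not secure."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the hash value with the private key."""
    return pow(toy_hash(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: decrypt with the public key and compare hash values."""
    return pow(signature, E, N) == toy_hash(message)
```

With a modulus this small, different messages will often share a hash value, so the toy can be fooled.  Real signatures use the full hash and a large key to make such collisions impractical to find.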

Certificate Signing

When a certificate is created, it is digitally signed.  The digital signature is used to verify the authenticity of the certificate.  In an SSL connection, the client will attempt to verify the signature on the certificate presented by the server before deciding to continue establishing the connection.

Self-signed certificates

The simplest certificate is a self-signed certificate.  The signature on a self-signed certificate is calculated using the same private key associated with the public key found on the certificate.  In a software development environment, self-signed certificates are an easy way to build testing environments that establish SSL connections without having to deal with the time and expense of obtaining a certificate through an established certificate authority.

Certificate Authority signed certificates

Certificates may also be signed by a certificate authority (CA).  A CA is a trusted third party that digitally signs certificates for entities that have gone through an established vetting process.  The CA, itself, also has a certificate that can be analyzed for authenticity – and the CA’s certificate might also be signed by yet another trusted third party (in this case, the CA is known as an intermediate CA).  All of these certificates, together, form a certificate chain.  At the top of the chain is a certificate authority called a Root CA that uses a self-signed certificate.
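
Walking a certificate chain can be modeled with a few lines of Python.  In this sketch each certificate is just a dict with hypothetical `subject` and `issuer` entries, and the signature check that a real client performs at every link is left out.

```python
def build_chain(leaf, certs_by_subject):
    """Follow issuer links from a leaf certificate up to a self-signed root."""
    chain = [leaf]
    current = leaf
    while current["issuer"] != current["subject"]:
        parent = certs_by_subject.get(current["issuer"])
        if parent is None:
            raise ValueError("missing issuer certificate: " + current["issuer"])
        if parent in chain:
            raise ValueError("issuer links form a loop")
        chain.append(parent)
        current = parent
    return chain


def chain_is_trusted(chain, trusted_roots):
    """The chain is acceptable only if it ends at a root the client trusts."""
    return chain[-1]["subject"] in trusted_roots
```

The loop stops at the first certificate whose issuer equals its own subject, which is how this simplified model recognizes the self-signed Root CA at the top of the chain.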

Returning to the Amazon.com example, we can examine the server certificate and see that it is not a self-signed certificate – it is signed by a CA with a distinguished name of:

CN=VeriSign Class 3 Secure Server CA – G2,

OU=Terms of use at https://www.verisign.com/rpa (c)09,

OU=VeriSign Trust Network,

O=VeriSign, Inc.,

C=US

From the certificate details tab we looked at earlier, we can find the name of the certificate issuer by selecting the “Issuer” entry:

Certificate Viewer in Internet Explorer

VeriSign is a trusted third party that issues certificates for, among other things, eCommerce applications.

Why is it important to have a certificate signed by a trusted third party?  It’s important because new HTTPS-based Web applications are being deployed all of the time.  In terms of browser-based, retail eCommerce applications, it’s simply impractical for users to manage a list of all of the server certificates they have decided to “trust”.  Furthermore, reliably and securely obtaining a web site’s real server certificate would be too problematic.

Consider a new Amazon.com customer.  When the shopping cart checkout sends Amazon.com’s certificate identifying itself as www.amazon.com, how does the customer decide to trust it?  The certificate may have a valid name and signature on it, but if it is self-signed, it could have been signed by anyone, including a malicious man-in-the-middle.  What the customer needs is a reliable means of receiving Amazon’s certificate that is protected from forgery.

To do this, Amazon chose not to use a self-signed certificate.  Instead, for a fee, it requested that VeriSign sign its online shopping certificate.  When the customer receives Amazon.com’s certificate, she says “I trust that I have a legitimate copy of VeriSign’s CA certificate, I trust VeriSign to only sign certificates of the real domain name owners, and I can see that Amazon.com’s certificate is signed by VeriSign.  Therefore, I believe the server responding at www.amazon.com is truly owned and managed by the same entity owning the www.amazon.com domain name.”

Where Do Trusted CAs Come From?

Why does the customer believe she has a legitimate copy of VeriSign’s CA certificate?  She believes this because a set of “trusted” CAs came pre-installed with her Web browser software.  The vendors of Internet Explorer, Firefox, and all other leading browsers pre-configure their browsers to trust well-known CAs such as VeriSign, Thawte, and Network Solutions.

C# programs will generally rely on the Windows certificate store for trusted certificates.  The certificate store is a shared OS resource usable by any program, and is also the same trust store consulted by Internet Explorer.

Similarly, the Java Runtime Environment comes with a pre-configured set of trusted certificate authorities.  The collection of trusted certificates can be found at [JRE_HOME]/lib/security/cacerts.  The keytool, a command line utility found in the SDK, can be used to inspect and manipulate this file.  The default password for the cacerts keystore is “changeit”.
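
Other runtimes follow the same pattern.  As an illustration only, here is how Python’s standard `ssl` module exposes its default trust settings; the file locations it reports depend on how that particular Python installation was built.

```python
import ssl

# A default client context loads the platform's trusted CA certificates.
context = ssl.create_default_context()

# It requires a valid certificate chain and a matching host name.
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname

# Where this Python build looks for trusted CA certificates by default.
paths = ssl.get_default_verify_paths()
print(paths.cafile, paths.capath)
```

Whatever the platform, the point is the same: the list of trusted CAs is configuration that ships with the runtime or the operating system, and a program inherits it unless it supplies its own.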

Summary

Public key cryptography is at the heart of SSL.  While most developers are aware this technology is used to encrypt a data channel, most are unfamiliar with its use of digital signing for identity authentication and message validation.  It is the lack of understanding of these other uses that generally stymies their efforts to implement SSL.

Stay tuned for more posts on the lower level details of implementing SSL technology.

5 Responses to “Understanding SSL – Part 1: Certificates and Keys”

  1. Abbner Torres Says:

    Mike what a wonderful blog you have, congratulations.
    Abbner

  2. Sunder Tatta Says:

    A very nice explanation of the SSL concepts. Well done Sir!

  3. Mattius Says:

    I look forward to your more detailed entries as I am tearing my hair out trying to get the OpenSource AS3Crypto lib for flash to connect to a C# SslStream!

  4. Alex Says:

    Congratulations, this blog is very useful. Thanks a lot.

  5. Ankit@Android Says:

    This helped me in creating an X.509 distinguished name. Thank you