In the 12+ years that CSS has been helping organizations deploy Public Key Infrastructures, we have frequently run into situations where PKI components are already present in the environment. Often it’s an older PKI that someone new to the organization has inherited and wants help evaluating; sometimes it’s a “temporary” deployment that an organization is looking to improve upon. In other cases, it’s simply a PKI design that a customer wants us to review and provide feedback on before deployment. In any case, these “Do-It-Yourself” installations, like any PKI, can create problems, headaches, and occasionally even more serious issues if mistakes are made during the design, deployment, or operation of the PKI. And while it’s often quite easy to deploy PKI components, PKI does tend to be one of those technologies where you have exactly one chance to get it right: at install time. After that, many parameters are more or less set in stone, and a re-deployment becomes the only way to fix a mistake.
With that in mind, this is in no way an all-inclusive list, but here are five of the most common mistakes we see when encountering “DIY” PKI:
Over-Architecting the CA Hierarchy
With many IT systems, a picture truly is worth a thousand words, and a diagram of the architecture’s “boxes and lines” makes up a majority of the design. This is the case for network diagrams, web/database application architectures, directory structures, and many other IT components. Following this mindset, many first-time PKI architects focus almost exclusively on the hierarchy of CAs: the makeup of Offline Root(s), Policy/Intermediate CAs, and Online Issuing CAs that will comprise the PKI.
However, while the CA hierarchy of a PKI is important to the design, it’s not the whole story. In fact, it’s usually not even the majority of the story, when it comes to scalability, availability, usability, or longevity. The “boxes and lines” get all the attention, but there are SO many other design decisions that are of equal or greater importance. Policies, security controls, CRL and revocation planning, algorithms and key sizes, and validity periods are some examples of areas that can have a more significant, lasting impact on PKI usefulness than the hierarchy of CAs.
Too much focus on the PKI hierarchy tends to add more “boxes and lines,” resulting in designs that involve more CAs than necessary. These additional CAs come at a cost: server and HSM hardware, OS licenses, and the operational expense of having more systems to monitor.
Under-Architecting Everything Else
Sometimes, it’s just too easy to click “Next.”
Unfortunately, with PKI, there are a large number of design aspects that, once configured, cannot be changed without a complete re-deployment, and as mentioned above, many of these can have a significant impact on the long-term usefulness of your PKI. These include:
- CA certificate names, key sizes, and signing algorithms: These are a part of each CA’s certificate, and can’t be changed. And since PKI components are around for many years, the ramifications of the choices can be very significant.
- CRL Distribution Points (CDP): CRL locations get put in the issued certificates, so changing them means you have to re-issue every cert.
- Operational policies and targeted assurance levels: The moment you issue your first certificate – whether you planned to or not – you just set the issuance policy for your CA. And once you’ve set the bar at a certain level, you can only lower it.
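Because these parameters are fixed once the CA is installed, it can help to write the planned values down and review them before running the installer. The following is a minimal sketch of such a review in Python. The `PlannedCA` record, the `review` function, and the thresholds (`MIN_RSA_BITS`, `WEAK_HASHES`) are all hypothetical names and assumed values chosen for illustration; your own policy should supply the real ones.

```python
from dataclasses import dataclass, field

# Assumed review thresholds for illustration only; take real values from your policy.
MIN_RSA_BITS = 2048
WEAK_HASHES = {"md5", "sha1"}


@dataclass
class PlannedCA:
    """Hypothetical record of the install-time choices for one CA."""
    common_name: str
    rsa_key_bits: int
    signature_hash: str
    cdp_urls: list = field(default_factory=list)


def review(ca):
    """Return a list of findings to resolve before the CA is installed."""
    findings = []
    if ca.rsa_key_bits < MIN_RSA_BITS:
        findings.append(
            f"{ca.common_name}: RSA key of {ca.rsa_key_bits} bits is below {MIN_RSA_BITS}"
        )
    if ca.signature_hash.lower() in WEAK_HASHES:
        findings.append(
            f"{ca.common_name}: signing algorithm uses weak hash {ca.signature_hash}"
        )
    if not ca.cdp_urls:
        findings.append(f"{ca.common_name}: no CRL Distribution Point planned")
    return findings


# Example: a design that would be expensive to fix after deployment.
draft = PlannedCA("Example Issuing CA", rsa_key_bits=1024, signature_hash="SHA1")
for finding in review(draft):
    print(finding)
```

A check like this does not replace a design review, but it forces the choices to be made deliberately rather than by clicking through the defaults.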
Far too many PKIs get deployed with lower-than-desired security controls, in the interest of saving time or avoiding operational effort. It’s important to strike a good balance between ease of operation and a sufficiently high level of assurance.
PKI enjoys a well-defined structure for policy and practices definition, in the form of the Certificate Policy (CP) and the Certification Practice Statement (CPS). These are excellent frameworks for defining the requirements governing a PKI, and the means by which an implementation would meet those requirements. Creating these documents can be a daunting task. However, it’s important to note that simply copying someone else’s set of CP/CPS documents verbatim will not suffice; these tools only have value if they truly represent your organization’s PKI requirements and operational processes.
Lack of Certificate Lifecycle Planning
Developing an issuance process that is also secure can take a significant amount of planning. And if the PKI is being used for embedded systems or network-enabled products or devices (the so-called “Internet of Things”), a secure, high-volume issuance process becomes critical.
A common mistake when deploying certificate-enabled applications, however, is to focus only on the rollout, and the tasks involved with the initial application deployment. But certificates do expire, and if planning doesn’t include the entire certificate lifecycle, major problems can result. The unexpected and unhandled expiration of certificates can cause significant outages and expense.
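One way to keep expiration from coming as a surprise is to track every issued certificate in an inventory and review it on a schedule. The sketch below, in Python, shows the idea under some assumptions: the `CertRecord` type and `renewal_report` function are hypothetical, the inventory is supplied by hand rather than read from a CA, and the 30-day renewal window is an arbitrary example value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CertRecord:
    """One row of a hypothetical certificate inventory."""
    subject: str
    not_after: datetime  # expiration time taken from the certificate


def renewal_report(inventory, now, renew_window=timedelta(days=30)):
    """Split an inventory into expired, due-for-renewal, and healthy subjects."""
    report = {"expired": [], "renew_now": [], "ok": []}
    for cert in inventory:
        if cert.not_after <= now:
            report["expired"].append(cert.subject)
        elif cert.not_after - now <= renew_window:
            report["renew_now"].append(cert.subject)
        else:
            report["ok"].append(cert.subject)
    return report


# Example with a fixed "now" so the result is reproducible.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
inventory = [
    CertRecord("web01.example.com", now - timedelta(days=1)),
    CertRecord("vpn.example.com", now + timedelta(days=10)),
    CertRecord("mail.example.com", now + timedelta(days=90)),
]
print(renewal_report(inventory, now))
```

The renewal window should be long enough to cover whatever approval and deployment steps a renewal actually requires in your environment.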
This concern isn’t exclusively tied to certificate expiration, either. Depending on the application involved, planning for other certificate lifecycle events such as revocation or key archival and recovery processes can be even more important than certificate renewal planning.
Misplaced Availability Planning
Availability is a key focus area for any IT design, but in some ways, PKI puts a “twist” on availability that is sometimes not well understood. In the same way that your driver’s license is still valid and usable when the DMV is closed, digital certificates are still valid and usable when a CA goes down. The only thing you can’t do when a CA is down is issue new certificates, which, for most organizations, is not as important as ensuring that the existing certs still work.
Despite this distinction, there is often still more focus on CA availability than CRL or OCSP availability, which are always at least as important. If your CRLs or OCSP servers are not available, all of your certificates can become unusable. Creating additional issuing CAs “for backup” sometimes solves a problem that doesn’t need to be solved, while placing a bigger burden on CRL and OCSP availability.
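Since relying parties depend on a current CRL, it is worth monitoring how close each published CRL is to its own expiration. Here is a minimal Python sketch that classifies a CRL from its thisUpdate and nextUpdate times. The `crl_health` function and the `warn_fraction` threshold are assumptions made for illustration, and the times are passed in directly rather than parsed from a real CRL file.

```python
from datetime import datetime, timedelta, timezone


def crl_health(this_update, next_update, now, warn_fraction=0.25):
    """Classify a CRL as 'fresh', 'expiring', or 'expired'.

    this_update and next_update are the CRL's own thisUpdate and nextUpdate
    fields. warn_fraction is an assumed threshold: warn once no more than
    that share of the CRL's lifetime remains.
    """
    if now >= next_update:
        return "expired"
    lifetime = next_update - this_update
    remaining = next_update - now
    if remaining <= lifetime * warn_fraction:
        return "expiring"
    return "fresh"


# Example: a CRL published weekly, checked six days after it was issued.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=7)
print(crl_health(issued, expires, now=issued + timedelta(days=6)))  # expiring
```

An "expiring" result while the CA is unreachable is the signal that matters most, because that is when a CA outage starts to threaten existing certificates.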
The “Templates of Doom”
Most of the information in this blog applies to just about any PKI, regardless of the CA software used.
This last item, however, applies more directly to Microsoft-based CAs. We see a majority of our customers using Microsoft CAs as a PKI building block, partly because of the features and capabilities of the software, and partly because they already own the licenses. But there are a few areas, particularly regarding the default certificate templates, where it can get users into trouble.
A “next, next, next” installation of a Microsoft CA will result in a CA that’s already configured to issue a number of different certificates; one such template is “Domain Controller,” which is used by Domain Controllers to authenticate communications with each other. Interestingly, Domain Controllers are special, in that they are pre-programmed to continually seek out CAs in the environment that can issue Domain Controller certificates, and when they find one, they automatically enroll. This may or may not be a problem, but it’s often not expected behavior. Unless steps are taken to avoid it, after the first CA installation into an Active Directory forest, every Domain Controller in the forest will have a certificate within 90 minutes.
Another default template that can have adverse consequences is the “User” certificate template. This has a tempting name, and is often set up for issuance to large numbers of enterprise users. However, this template combines a number of different use cases, including authentication, Microsoft EFS, and email encryption. These certificates allow users to encrypt files and email, but do not include any provision for key archival, which can put your organization at risk of data loss if the certificates are later lost or deleted.
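The underlying rule, that any template used for encryption should have a key recovery plan, is easy to check once the templates are written down. The Python sketch below applies that rule to hand-written template summaries. The `TemplateSummary` type, the `archival_gaps` function, and the usage labels are hypothetical; nothing here reads from Active Directory or a real CA.

```python
from dataclasses import dataclass

# Usages that encrypt data, so losing the private key means losing the data.
ENCRYPTION_USES = {"EFS", "email encryption"}


@dataclass
class TemplateSummary:
    """Hand-written summary of a certificate template (hypothetical format)."""
    name: str
    uses: set
    key_archival: bool


def archival_gaps(templates):
    """Return names of templates that enable encryption without key archival."""
    return [
        t.name
        for t in templates
        if t.uses & ENCRYPTION_USES and not t.key_archival
    ]


# Example based on the two default templates described above.
templates = [
    TemplateSummary("Domain Controller", {"authentication"}, key_archival=False),
    TemplateSummary(
        "User",
        {"authentication", "EFS", "email encryption"},
        key_archival=False,
    ),
]
print(archival_gaps(templates))  # ['User']
```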