Production considerations when running a certificate authority server

step-ca is built for robust certificate management in distributed systems. As with any entity in your infrastructure, running step-ca effectively in production requires some knowledge of its strengths and limitations. This document addresses the important production considerations that operators should know about when running step-ca as a certificate authority server.

Good Security Practices

In this section we recommend a few best practices when it comes to running, deploying, and managing your own online certificate authority server and PKI. Security is a moving target. We expect our recommendations to change and evolve as well.

Safeguard Your Root And Intermediate CA Keys

When you initialize a two-tier CA, two private keys are generated: one intermediate private key, and one root private key. It is very important that these private keys be kept secret.

The intermediate key is used by the CA to sign certificates. The root key is not needed for day-to-day CA operation and should be stored offline. The keys can be generated on an air-gapped device or on a Hardware Security Module (HSM).

Here's an example key protection strategy for a high-security production PKI.

In this example, step-ca acts as a subordinate CA to an offline root CA.

  1. Generate a root CA (private key and certificate) on a Hardware Security Module (HSM) or air-gapped device that is kept in "cold storage", off the internet. HSMs are ideal for storing private keys and performing signing operations securely. For durability, keep at least two copies of your root key, in separate locations.
  2. Generate intermediate key(s) on a separate, online cloud HSM or in a key management service (KMS) that will be used by the CA for signing operations in production
  3. Generate Certificate Signing Requests (CSRs) for your intermediate CA(s)
  4. Sign the generated CSR using the root HSM
  5. Configure step-ca to use the signed root and intermediate certificates
  6. Configure step-ca to access the cloud HSM or KMS intermediate key for online signing operations
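The signing flow in steps 1 through 4 can be sketched with plain OpenSSL. This is only an illustration under stated assumptions: software keys in a temporary directory stand in for the offline HSM and the online KMS, and every file name, subject, and validity period below is hypothetical.

```shell
workdir="$(mktemp -d)"
cd "$workdir"

# Minimal OpenSSL config so the sketch does not depend on a system openssl.cnf.
cat > pki.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no

[ dn ]
CN = Example CA

[ v3_root ]
basicConstraints = critical, CA:TRUE, pathlen:1
keyUsage = critical, keyCertSign, cRLSign
subjectKeyIdentifier = hash

[ v3_intermediate ]
basicConstraints = critical, CA:TRUE, pathlen:0
keyUsage = critical, keyCertSign, cRLSign
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid
EOF

# Step 1: root key and self-signed root certificate (an offline device in practice).
openssl ecparam -name prime256v1 -genkey -noout -out root_ca.key
openssl req -x509 -new -config pki.cnf -extensions v3_root \
  -subj "/CN=Example Root CA" -key root_ca.key -sha256 -days 3650 -out root_ca.crt

# Steps 2 and 3: intermediate key and CSR (an online HSM or KMS in practice).
openssl ecparam -name prime256v1 -genkey -noout -out intermediate_ca.key
openssl req -new -config pki.cnf -subj "/CN=Example Intermediate CA" \
  -key intermediate_ca.key -out intermediate_ca.csr

# Step 4: the root signs the intermediate CSR.
openssl x509 -req -in intermediate_ca.csr -CA root_ca.crt -CAkey root_ca.key \
  -CAcreateserial -sha256 -days 1825 \
  -extfile pki.cnf -extensions v3_intermediate -out intermediate_ca.crt

# Confirm that the intermediate chains to the root.
openssl verify -CAfile root_ca.crt intermediate_ca.crt
```

The two certificates produced here are what steps 5 and 6 would hand to step-ca. In a real deployment the root key and the intermediate key would never sit side by side on one disk as they do in this sketch.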

See the Cryptographic Protection section of our Configuration Guide to learn more about your options for using HSMs or cloud KMS with step-ca.

Use Strong Passwords and Store Them Well

When you initialize your PKI, the root and intermediate private keys will be encrypted with the same password.

Use a password manager to generate random passwords, or let step ca init generate a strong password for you.

After initializing your CA, we recommend that you immediately change the password for the intermediate CA private key:

step crypto change-pass $(step path)/secrets/intermediate_ca_key

You'll use this new intermediate key password to start step-ca.

Once you've changed the intermediate private key password, you should never have to use the root private key password again. So, then what should you do with it?

Bury it in a cave high in the mountains.

Or, store it in a password manager or secrets manager. There are many to choose from and the choice will depend on the risk & security profile of your organization.

In addition to using a password manager to store all passwords (private key, provisioner password, etc.) we recommend using a threshold cryptography algorithm like Shamir's Secret Sharing to divide the root private key password across a handful of trusted parties.

Avoid Storing Passwords in Environment Variables

systemd discourages using the environment for secrets because it doesn't consider it secure and exposes a unit's environment over dbus. From systemd.exec(5):

Note that environment variables are not suitable for passing secrets (such as passwords, key material, …) to service processes. Environment variables set for a unit are exposed to unprivileged clients via D-Bus IPC, and generally not understood as being data that requires protection. Moreover, environment variables are propagated down the process tree, including across security boundaries (such as setuid/setgid executables), and hence might leak to processes that should not have access to the secret data.

For some isolated environments, we could see an argument for the convenience of an environment variable. Even then, there can be subtle issues. For example, anyone with access to the Docker daemon can view all of the environment variables of running Docker containers, using docker inspect.
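The propagation problem described in the systemd note is easy to see for yourself. The following is a small sketch, using a made-up variable value, showing that a child process receives an exported secret without being told about it:

```shell
# Hypothetical secret, exported the way an environment-based setup would do it.
export STEP_CA_PASSWORD='not-a-real-password'

# Any child process inherits the variable automatically.
seen_by_child="$(sh -c 'printf "%s" "$STEP_CA_PASSWORD"')"
echo "child process saw: ${seen_by_child}"

# Remove the variable once it is no longer needed in this shell.
unset STEP_CA_PASSWORD
```

The same inheritance applies to every program the service starts, which is why a password file with tight permissions is usually the safer choice.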

For posterity, however, if you've secured your environment and rely on it for secrets, there is a way to pass a password into step-ca from an environment variable in Bash:

step-ca --password-file <(echo -n "$STEP_CA_PASSWORD") $(step path)/config/ca.json

This method is known as Bash process substitution, and on most systems the password will not appear in ps output. However, we don't recommend this approach, simply because it's so difficult to ensure security with environment variables.

See our blog How to Handle Secrets on the Command Line for an in-depth exploration of this topic.

Replace Your Default Provisioner

When you initialize your PKI, a default JWK provisioner will be created. If you're not going to use this provisioner, we recommend deleting it.

If you are going to use this provisioner, secure it with a different password than your CA's signing keys. You can do this by passing separate --password-file and --provisioner-password-file files when running step ca init.

See the section on managing provisioners.

Use Short-Lived Certificates

We recommend the following:

  • User certificates should have the lifespan of a mayfly: about a day or less [1].
  • Host or service account certificates should have a lifetime of one month or less.

Certificates from step-ca expire in 24 hours by default. We made it easy for you to automate the renewal of your certificates using the step command. Carpe diem!

You can configure certificate lifetimes in the ca.json file.
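As a rough sketch of what that might look like, the snippet below writes an illustrative fragment to a temporary file. It assumes the TLS duration claims live under authority.claims in ca.json and uses the recommendations above as values (24 hour default, 30 day maximum); treat both the key names and the values as assumptions to confirm against the Configuration Guide before merging anything into your own ca.json.

```shell
fragment="$(mktemp)"

# Illustrative only: merge the "claims" object into the "authority" section of ca.json by hand.
cat > "$fragment" <<'EOF'
{
  "authority": {
    "claims": {
      "minTLSCertDuration": "5m",
      "maxTLSCertDuration": "720h",
      "defaultTLSCertDuration": "24h"
    }
  }
}
EOF

cat "$fragment"
```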

[1] One mayfly species, Dolania americana, lives for five minutes or less! So do some certificates. But it can be difficult to operationalize such short-lived certificates.

Enable Active Revocation On Your Intermediate CA

The value of a two-tiered PKI is that you can add your root CA certificate to the certificate trust store on all of your nodes, and store your root private key completely offline. A leaf certificate signed by the CA always comes in a bundle that contains the intermediate CA certificate alongside the leaf certificate. With this bundle, any client that trusts your root CA can verify the complete chain of trust.

Now, what if one day your intermediate CA key is compromised? You could issue a new intermediate using your root CA key, but your old intermediate has a 10 year validity period! So, you're stuck having to rotate your root CA too, and that's a much bigger project because you have to distribute the new root certificate to everyone and ensure the old one is no longer trusted. To avoid this scenario, you can use active revocation on your intermediate CA certificate, making it possible to immediately revoke a compromised intermediate.

While step-ca doesn't directly support active revocation mechanisms like Certificate Revocation Lists (CRLs) or the Online Certificate Status Protocol (OCSP), you can independently manage your own CRL if you like.

Create an intermediate CA with a CRL distribution endpoint

Let's make it possible to revoke your intermediate CA down the road if necessary. This setup is more complex than the default step-ca PKI, but it offers an insurance policy for a compromised intermediate CA.

  1. Create an intermediate CA that includes a CRL endpoint. Save the following template to intermediate.tpl:

    {
        "subject": {{ toJson .Subject }},
        "keyUsage": ["certSign", "crlSign"],
        "basicConstraints": {
            "isCA": true,
            "maxPathLen": 0
        },
        "crlDistributionPoints": ["http://crl.example.com/crl/ca.crl"]
    }

    You'll need this template to manually create your intermediate CA. The CRL endpoint here should be an HTTP URL; the CRL file itself is signed. The CRL will be a static file, so you might choose an object storage or CDN endpoint here.

    Use the template to create your intermediate CA. You will need your root CA certificate and key:

    $ step certificate create "Example Intermediate CA" \
        $(step path)/certs/intermediate_ca.crt \
        $(step path)/secrets/intermediate_ca_key \
        --template intermediate.tpl \
        --ca $(step path)/certs/root_ca.crt \
        --ca-key $(step path)/secrets/root_ca_key \
        --not-after 87660h
  2. Create an empty CRL file and sign it with your root CA key:

    cat <<EOF > openssl.conf
    [ ca ]
    default_ca = CA_default

    [ CA_default ]
    default_crl_days = 30
    database = index.txt
    default_md = sha256
    EOF

    touch index.txt

    openssl ca \
        -config openssl.conf \
        -gencrl \
        -keyfile $(step path)/secrets/root_ca_key \
        -cert $(step path)/certs/root_ca.crt \
        -out ca.crl.pem

    openssl crl \
        -inform PEM \
        -in ca.crl.pem \
        -outform DER \
        -out ca.crl
  3. Upload the DER-formatted ca.crl file to the distribution point URL you specified in the template.

  4. Finally, configure your step-ca server to use the intermediate CA you created.

Revoke A Certificate

To revoke a certificate, add it to the index.txt file before regenerating the CRL file. The format for this CRL database file is:

  • One certificate per line
  • Each line is tab-delimited
  • The tab-delimited fields are:
    1. Entry type. May be V (valid), R (revoked) or E (expired). An expired certificate may have the type V because the type has not been updated. openssl ca updatedb does such an update.
    2. Expiration datetime. Format is yymmddHHMMSSZ.
    3. Revocation datetime and optional revocation reason. Must be set for any entry of the type R. Format is yymmddHHMMSSZ[,reason].
    4. Certificate serial number in uppercase hexadecimal, e.g. 804A72D941DB451A0123BA4706446D1F.
    5. File name. This doesn't seem to be used, ever, so use the value unknown.
    6. Certificate subject, e.g. CN=Test Intermediate CA,O=Smallstep Labs\, Inc
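Putting the six fields together, a revoked entry could be appended as sketched below. The serial, subject, and expiration values are placeholders borrowed from the examples above, and a temporary file stands in for index.txt so the sketch is safe to run anywhere.

```shell
# Hypothetical values; read the real ones from the certificate being revoked.
serial="804A72D941DB451A0123BA4706446D1F"
expires="330101000000Z"                              # expiration, yymmddHHMMSSZ
revoked="$(date -u +%y%m%d%H%M%SZ),keyCompromise"    # revocation time and reason
subject='CN=Test Intermediate CA,O=Smallstep Labs\, Inc'

db="$(mktemp)"   # stand-in for index.txt

# Fields: type, expiration, revocation, serial, file name, subject (tab-delimited).
printf 'R\t%s\t%s\t%s\tunknown\t%s\n' "$expires" "$revoked" "$serial" "$subject" >> "$db"

# Sanity check: every entry should have exactly six tab-separated fields.
awk -F '\t' 'NF != 6 { bad = 1 } END { exit bad }' "$db" && echo "index format OK"
```

For a real certificate, openssl x509 -noout -serial -enddate -in followed by the certificate path prints the serial number and expiration date to copy into the entry. After updating index.txt, regenerate the CRL with the same openssl ca -gencrl command used to create it and upload the new DER file.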

Use Templates With Care

Because certificate templates add a lot of flexibility to step-ca, they can be a source of subtle vulnerabilities in your PKI. If you use custom certificate templates, be sure they are tightly restricted for your use case. Use extreme caution when referencing user-supplied data marked as Insecure in templates.

Create a Service User to Run step-ca

Make sure that the configuration folders, private keys, and password file used by the CA are only accessible by this user. If you're running step-ca on port 443, you'll need the step-ca binary to be able to bind to that port. See Running step-ca as a Daemon for details.
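One way to restrict access is a recursive chmod that leaves only the owner with permissions. The sketch below runs against a temporary directory that stands in for the CA directory; the layout and file names are assumptions modeled on the paths used elsewhere in this document.

```shell
# A temporary directory stands in for /etc/step-ca so the sketch is safe to run.
ca_dir="$(mktemp -d)"
mkdir -p "$ca_dir/config" "$ca_dir/secrets"
: > "$ca_dir/config/ca.json"
: > "$ca_dir/secrets/intermediate_ca_key"
: > "$ca_dir/password.txt"

# Owner keeps read/write (plus traversal on directories); group and others get nothing.
chmod -R u=rwX,go= "$ca_dir"

ls -ld "$ca_dir" "$ca_dir/secrets" "$ca_dir/password.txt"
```

On a real host you would run the chmod against the CA directory after assigning ownership to the service user.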

Running step-ca as a Daemon

Note: This section requires a Linux OS running systemd version 245 or greater.

  1. Add a service user for the CA.

    The service user will only be used by systemd to manage the CA. Run:

    $ sudo useradd --system --home /etc/step-ca --shell /bin/false step

    If your CA will bind to port 443, the step-ca binary will need to be given low port-binding capabilities:

    $ sudo setcap CAP_NET_BIND_SERVICE=+eip $(which step-ca)
  2. Move your CA configuration into a system-wide location. Run:

    $ sudo mv $(step path) /etc/step-ca

    Make sure your CA password is located in /etc/step-ca/password.txt, so that it can be read upon server startup.

    You'll also need to edit the file /etc/step-ca/config/defaults.json to reflect the new path.

    Set the step user as the owner of your CA configuration directory:

    $ sudo chown -R step:step /etc/step-ca
  3. Create a systemd unit file.

    $ sudo touch /etc/systemd/system/step-ca.service

    Add the following contents:

    [Unit]
    Description=step-ca service
    Documentation=https://smallstep.com/docs/step-ca
    Documentation=https://smallstep.com/docs/step-ca/certificate-authority-server-production
    After=network-online.target
    Wants=network-online.target
    StartLimitIntervalSec=30
    StartLimitBurst=3
    ConditionFileNotEmpty=/etc/step-ca/config/ca.json
    ConditionFileNotEmpty=/etc/step-ca/password.txt

    [Service]
    Type=simple
    User=step
    Group=step
    Environment=STEPPATH=/etc/step-ca
    WorkingDirectory=/etc/step-ca
    ExecStart=/usr/bin/step-ca config/ca.json --password-file password.txt
    ExecReload=/bin/kill --signal HUP $MAINPID
    Restart=on-failure
    RestartSec=5
    TimeoutStopSec=30
    StartLimitInterval=30
    StartLimitBurst=3

    ; Process capabilities & privileges
    AmbientCapabilities=CAP_NET_BIND_SERVICE
    CapabilityBoundingSet=CAP_NET_BIND_SERVICE
    SecureBits=keep-caps
    NoNewPrivileges=yes

    ; Sandboxing
    ProtectSystem=full
    ProtectHome=true
    RestrictNamespaces=true
    RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
    PrivateTmp=true
    PrivateDevices=true
    ProtectClock=true
    ProtectControlGroups=true
    ProtectKernelTunables=true
    ProtectKernelLogs=true
    ProtectKernelModules=true
    LockPersonality=true
    RestrictSUIDSGID=true
    RemoveIPC=true
    RestrictRealtime=true
    SystemCallFilter=@system-service
    SystemCallArchitectures=native
    MemoryDenyWriteExecute=true
    ReadWriteDirectories=/etc/step-ca/db

    [Install]
    WantedBy=multi-user.target

    (This file is also hosted on GitHub)

    Here are some notes on the security properties in this file:

    • User and Group cause step-ca to run as a non-privileged user.

    • AmbientCapabilities allows the process to receive ambient capabilities. CAP_NET_BIND_SERVICE allows the process to bind to ports < 1024. See capabilities(7).

    • CapabilityBoundingSet limits the set of capabilities the process can have.

    • SecureBits allows the service to keep its capabilities even after switching to the step user.

    • NoNewPrivileges ensures no future privilege escalation by the process.

    • ProtectSystem and ProtectHome configure sandboxing via a read-only file system namespace dedicated to the process.

    • RestrictNamespaces prevents the process from creating kernel namespaces.

    • RestrictAddressFamilies prevents the service from allocating esoteric sockets such as AF_PACKET.

    • PrivateTmp gives the service its own private /tmp.

    • PrivateDevices presents a very limited /dev to the service.

    • Protect* limits access to system resources.

    • LockPersonality locks the process's execution domain.

    • RestrictSUIDSGID restricts setuid/setgid file creation.

    • RemoveIPC removes any IPC objects created by the service when it is stopped.

    • RestrictRealtime restricts real-time scheduling access.

    • SystemCallFilter defines an allow list of system calls the service can use.

    • SystemCallArchitectures restricts the service to only be able to call native system calls.

    • MemoryDenyWriteExecute prevents the service from creating writable-executable memory mappings.

    • ReadWriteDirectories ensures that the process can write its state directories.

  4. Enable and start the service.

    The following are a few useful commands for checking the status of your CA, enabling it on system startup, and starting your CA.

    # Rescan the systemd unit files
    $ sudo systemctl daemon-reload

    # Check the current status of the step-ca service
    $ sudo systemctl status step-ca

    # Enable and start the `step-ca` process
    $ sudo systemctl enable --now step-ca

    # Follow the log messages for step-ca
    $ journalctl --follow --unit=step-ca
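If you want a second opinion on the unit file, systemd ships an analysis tool that can lint it and summarize its sandboxing. The following is a sketch that assumes the unit path used above; it does nothing on hosts where systemd-analyze or the unit file is absent.

```shell
unit_file="/etc/systemd/system/step-ca.service"

if command -v systemd-analyze >/dev/null 2>&1 && [ -f "$unit_file" ]; then
  # Lint the unit file for unknown directives and syntax errors.
  systemd-analyze verify "$unit_file"
  # Print a per-directive exposure report for the service.
  systemd-analyze security step-ca.service
else
  echo "systemd-analyze or ${unit_file} not found; nothing to check"
fi
```

The security report lists each sandboxing directive with an exposure rating, which makes it easier to see the effect of loosening or tightening a setting.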

High Availability

A few things to consider / implement when running multiple instances of step-ca:

  • Use a MySQL database. The default Badger database has no concurrency support. The only integrated DB that can support multiple instances is MySQL. See the database documentation to learn how to configure step-ca for MySQL.
  • Respect concurrency limits. The ACME server has known concurrency limitations when using the same account to manage multiple orders. The recommended temporary workaround is to generate an ephemeral account keypair for each new ACME order, or to ensure that ACME orders owned by the same account are managed serially. The issue tracking this limitation can be found here.
  • Synchronize ca.json across instances. step-ca reads all of its configuration (and all of the provisioner configuration) from the ca.json file specified on the command line. If the ca.json of one instance is modified (either manually or using a command like step ca provisioner (add | remove)), the other instances will not pick up the change until the ca.json is copied to the correct location for each instance and the instance is sent SIGHUP or restarted. We recommend using a configuration management tool (ansible, chef, salt, puppet, etc.) to synchronize ca.json across instances.
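The copy-then-reload step for ca.json can be sketched as a small helper that only replaces the file when its contents differ, so instances are not reloaded needlessly. The function name and the temporary files are hypothetical; a configuration management tool would normally do this work.

```shell
# Copy a new ca.json into place only when it differs from the live one.
# Returns 0 when the file was updated, 1 when it was already current.
sync_ca_config() {
  _src="$1"
  _dst="$2"
  if [ -f "$_dst" ] && cmp -s "$_src" "$_dst"; then
    return 1
  fi
  # Write to a temporary name first so the live file is replaced in one step.
  cp "$_src" "$_dst.tmp" && mv "$_dst.tmp" "$_dst"
}

# Demo with temporary files standing in for the managed and the live ca.json.
demo="$(mktemp -d)"
printf '{"address": ":443"}\n' > "$demo/managed.json"

if sync_ca_config "$demo/managed.json" "$demo/ca.json"; then
  echo "ca.json updated; reload step-ca (for example: systemctl reload step-ca)"
fi
if ! sync_ca_config "$demo/managed.json" "$demo/ca.json"; then
  echo "ca.json unchanged; no reload needed"
fi
```

With the systemd unit shown earlier in this document, systemctl reload step-ca sends the HUP signal through ExecReload.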

Load balancing or proxying step-ca traffic

If you need to place a load balancer or reverse proxy downstream from the CA, we recommend using layer 4 (TCP) load balancing or proxying (aka "TLS passthrough").

We also recommend enabling remote provisioner management so that your provisioner configuration can be shared by all step-ca instances.

Layer 7 proxying is not recommended, because the step toolchain is built around TLS:

  • step expects to be able to perform a TLS handshake with step-ca, using the CA's root certificate to complete the trust chain.

  • Certificate renewal uses mutual TLS by default. step-ca authenticates the client using the expiring certificate, in order to issue a new one. Mutual TLS requires a direct, end-to-end TLS handshake between step and step-ca. (See step ca renew --help for more details.)

  • By design, step-ca does not have an option to run in HTTP only. Philosophically, we value perimeterless security and we believe people should use authenticated encryption (e.g. mutual TLS) everywhere. Making mTLS easy, and helping people get away from the "perimeter security" anti-pattern, are motivating goals behind the project.

    That said, lots of folks have legacy issues to contend with, some of these decisions are out of their control, and every threat model is different. See certificates#246 for more details.

Further Reading

  • Nginx has a stream module that allows it to pass TLS traffic directly to step-ca. But it comes with a price: Unlike typical reverse proxy configurations, source IPs are not visible to step-ca (there is no X-Forwarded-For header), and traffic is also not logged to the nginx access log. See this blog post for an example of TLS passthrough.
  • Caddy doesn't natively support TLS passthrough, but there is an experimental caddy-l4 module that can do it.
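For orientation, a minimal nginx TLS passthrough configuration might look like the sketch below, written to a temporary file. The upstream name and the two backend addresses are hypothetical, and the stream module must be available in your nginx build.

```shell
conf="$(mktemp)"

cat > "$conf" <<'EOF'
# Layer 4 (TCP) passthrough: nginx forwards the TLS stream without terminating it.
events {}

stream {
    upstream step_ca {
        server 10.0.0.11:443;   # hypothetical step-ca instance
        server 10.0.0.12:443;   # hypothetical step-ca instance
    }

    server {
        listen 443;
        proxy_pass step_ca;
    }
}
EOF

echo "wrote ${conf}"
```

Because nginx never sees the decrypted traffic in this mode, the TLS handshake still happens directly between step and step-ca.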

Automate X.509 Certificate Lifecycle Management

We recommend automating certificate renewal when possible. Renewal can be easily automated in many environments. See our Renewal documentation for details.
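At the heart of any renewal job is a check for how close a certificate is to expiring. The sketch below illustrates that check with OpenSSL against a throwaway one-day certificate; the helper name and file names are hypothetical, and in practice the step command's own renewal support takes care of this for you.

```shell
certdir="$(mktemp -d)"

# Throwaway self-signed certificate valid for one day, standing in for a leaf from step-ca.
cat > "$certdir/req.cnf" <<'EOF'
[ req ]
distinguished_name = dn
prompt = no

[ dn ]
CN = example.internal
EOF
openssl ecparam -name prime256v1 -genkey -noout -out "$certdir/leaf.key"
openssl req -x509 -new -config "$certdir/req.cnf" -key "$certdir/leaf.key" \
  -sha256 -days 1 -out "$certdir/leaf.crt"

# needs_renewal CERT SECONDS: succeeds when CERT expires within SECONDS from now.
# openssl's -checkend exits 0 only if the certificate is still valid at that point.
needs_renewal() {
  ! openssl x509 -in "$1" -noout -checkend "$2" >/dev/null
}

if needs_renewal "$certdir/leaf.crt" 3600; then
  echo "expires within 1 hour: renew now"
else
  echo "more than 1 hour of validity left"
fi
```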

X.509 Certificate Revocation

By default, step-ca uses passive revocation. Certificates can be revoked using the step ca revoke subcommand. See our Revocation documentation for details.

Sane Cryptographic Defaults

The step ecosystem uses sane defaults so that you don't have to be a security engineer to use step-ca safely. Our defaults align with best current practices in the industry for using cryptographic primitives and higher order abstractions, like JWTs.

This section describes our defaults and explains the rationale behind them. Our selections and guidance will change and evolve over time as security and cryptography are constantly changing in response to real world pressures.

Tokens

We use JWTs (JSON Web Tokens) to prove authenticity and identity within the step ecosystem. When configured well, JWTs are a great way to sign and encode data. It's easy to use JWTs insecurely, though, so you must be deliberate about how you validate and verify them (see RFC7519).

step-ca produces JWTs that:

  • are short-lived (5 minute lifespan)
  • are one-time-use tokens (during the lifetime of the step-ca)
  • have a 1 minute clock drift leeway

If you're using step-ca JWTs in your code, be sure to verify and validate every standard attribute of the JWT. step crypto jwt verify can validate any JWT for you, and it follows the spec to the letter.
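To make the timing rules concrete, here is a small arithmetic sketch of a validity-window check. It assumes the leeway is applied to both the not-before and expiry times, which is a common reading rather than a statement of how step-ca implements it; the function name and timestamps are hypothetical.

```shell
# token_time_valid NOW NBF EXP LEEWAY (all Unix seconds)
# Succeeds when NOW falls inside [NBF - LEEWAY, EXP + LEEWAY].
token_time_valid() {
  _now="$1"; _nbf="$2"; _exp="$3"; _leeway="$4"
  [ "$_now" -ge "$((_nbf - _leeway))" ] && [ "$_now" -le "$((_exp + _leeway))" ]
}

issued=1700000000            # hypothetical issue time
expires=$((issued + 300))    # 5 minute lifespan
leeway=60                    # 1 minute clock drift allowance

token_time_valid $((issued + 120)) "$issued" "$expires" "$leeway" \
  && echo "2 minutes after issue: accepted"
token_time_valid $((issued + 330)) "$issued" "$expires" "$leeway" \
  && echo "30 seconds past expiry: accepted (within leeway)"
token_time_valid $((issued + 400)) "$issued" "$expires" "$leeway" \
  || echo "100 seconds past expiry: rejected"
```

A time check like this is only one part of validation; the signature, issuer, audience, and one-time-use properties still need to be verified.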

Key Types and Ciphers

Supported Key Types: ECDSA, EdDSA, and RSA
Default Key Type: ECDSA
Default Curve Bits: P-256

We chose ECDSA keys because they offer better security and performance than RSA keys. At 256 bits, ECDSA keys provide 128 bits of security, and they are supported by most modern clients.

More notes on the choice of key type:

  • RSA keys are often chosen for compliance reasons.
  • EdDSA keys are even smaller and faster than ECDSA keys. Were it supported by more clients, it would be the default.
  • The NIST standard curves for ECDSA are hard to implement correctly, so there's concern that the implementations of them may have problems.
  • If the NSA is in your threat model, you may not want to use ECDSA keys. The NSA has never published how they chose the magic numbers that drive ECDSA implementations.

Default PEM Cipher: AES128
Supported PEM Key Sizes: 128, 192, and 256 bits

We've chosen the AES encryption algorithm for writing private keys to disk because it was the official choice of the Advanced Encryption Standard contest.

All supported key sizes are considered to be unbreakable for the foreseeable future. We chose 128 bits as our default because the performance is better as compared to the greater key sizes, and because 128 bits are sufficient for most security needs.

X.509 Certificates

Root CA Certificate

The Root CA certificate is generated once, when you run step ca init.

Validity (10 year window)

  • Not Before: Now
  • Not After: Now + 10 years

A 10 year window is advisable until software and tools can be written for rotating the root certificate.

Basic Constraints

  • CA: TRUE

    The root certificate is a certificate authority and will be used to sign other Certificates.

  • Path Length: 1

    The Path Length constraint expresses the number of possible intermediate CA certificates in a path built from an end-entity certificate up to the CA certificate.

    The default step PKI has only one intermediate CA certificate between end-entity certificates and the root CA certificate.

Key Usage

Key Usage describes how the certificate can be used.

  • Certificate Sign: indicates that our root public key will be used to verify a signature on certificates.
  • CRL Sign: indicates that our root public key will be used to verify a signature on revocation information, such as CRL.

Intermediate CA Certificate

The Intermediate CA certificate is generated once, when you run step ca init. It is signed by the Root CA certificate.

The Path Length of the intermediate certificate is 0. Otherwise it uses the same defaults as the root certificate.

A Path Length of zero indicates that there can be no additional intermediary certificates in the path between the intermediate CA certificate and end-entity certificates.

Leaf (End Entity) Certificate

These are the certificates issued by the step-ca server.

Validity (24 hour window)

  • Not Before: Now
  • Not After: Now + 24 hours

The default is a 24-hour window. This value is somewhat arbitrary. However, our goal is to have seamless end-entity certificate rotation. Rotating certificates frequently is a good security measure because it gives attackers very little time to mount an attack and limits the usefulness of any single private key in the system.

We will continue to work towards decreasing this window because we believe it significantly reduces the probability and effectiveness of any attack.

Key Usage

Key Usage describes how the certificate can be used.

  • Key Encipherment: indicates that a certificate will be used with a protocol that encrypts keys.
  • Digital Signature: indicates that this public key may be used as a digital signature to support security services that enable entity authentication and data origin authentication with integrity.

Extended Key Usage

  • TLS Web Server Authentication: certificate can be used as the server side certificate in the TLS protocol.

  • TLS Web Client Authentication: certificate can be used as the client side certificate in the TLS protocol.

TLS Defaults

These are the defaults used for communication between step and step-ca.

Min TLS Version: TLS 1.2
Max TLS Version: TLS 1.2

The PCI Security Standards Council required all payment processors and merchants to move to TLS 1.2 and above by June 30, 2018. By setting TLS 1.2 as the default for all TLS protocol negotiation, we encourage our users to adopt the same security conventions.

Renegotiation: Never

TLS renegotiation significantly complicates the state machine and has been the source of numerous, subtle security issues. Therefore, by default we disable it.

Default TLS Cipher Suites
[
    "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256"
]

The default 'ciphersuites' are a list of two cipher combinations. For communication between services running step there is no need for cipher suite negotiation. The server can specify a single cipher suite which the client is already known to support.

Reasons for selecting TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305:

  • ECDHE key exchange algorithm has perfect forward secrecy
  • ECDSA has smaller keys and better performance than RSA
  • CHACHA20 with POLY1305 is the cipher mode used by Google.
  • CHACHA20's performance is better than GCM and CBC.

The HTTP/2 spec requires that the TLS_ECDHE_(RSA|ECDSA)_WITH_AES_128_GCM_SHA256 cipher suite be accepted by the server, so it is included in our list of default cipher suites.

Approved TLS Cipher Suites
[
    "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
]

Above is a list of step-approved cipher suites. Not all communication can be resolved with step TLS functionality. For those connections, the list of server supported cipher suites must have more options in case older clients do not support our favored cipher suite.

Reasons for selecting these cipher suites can be found in the following ssllabs article.