[enhancement]: Support for an encrypted metadata/userdata workflow via the Datasource interface

# Enhancement

There is often a need to bootstrap a guest with information that may be classified as sensitive, e.g., a public/private key pair written out as files. While every datasource _could_ develop its own means of handling this problem, it occurred to me recently that perhaps they do not have to.

I've been busy of late digging into how to leverage TPMs as a means to securely bootstrap machines (see https://github.com/google/go-tpm/pull/343 and https://github.com/vmware/govmomi/pull/3222), and developed a workflow for bootstrapping guests that leverages the TPM's existing endorsement key (EK):

1. Client creates a shared secret used to encrypt bootstrap data. The shared secret is no larger than 128 bytes, which is the recommended `MAX_SYM_SIZE` for portability across TPMs.
2. Client encrypts bootstrap data using this shared secret.
3. Client encrypts shared secret using the public EK from the target machine's TPM.
4. Client shares/injects the encrypted shared secret and encrypted bootstrap data with the machine and powers it on.
5. Bootstrap engine decrypts the shared secret using the TPM's EK.
6. Bootstrap engine decrypts the bootstrap data using the decrypted, shared secret.
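
The six steps above amount to envelope encryption, and the data flow can be sketched in a few lines of Python. This is only an illustration: `Envelope`, `seal`, and `unseal` are hypothetical names, and the ciphers are passed in as callables because nothing here prescribes a symmetric algorithm or how the EK operation is invoked.

```python
import secrets
from dataclasses import dataclass
from typing import Callable

# Upper bound on the shared secret, per step 1 of the workflow.
MAX_SECRET_SIZE = 128

Cipher = Callable[[bytes, bytes], bytes]  # (shared_secret, data) -> data
KeyOp = Callable[[bytes], bytes]          # secret -> wrapped/unwrapped secret


@dataclass(frozen=True)
class Envelope:
    """What the client injects into the guest (step 4). Hypothetical type."""

    encrypted_secret: bytes
    encrypted_data: bytes


def seal(data: bytes, encrypt: Cipher, wrap_with_ek: KeyOp,
         secret_size: int = 32) -> Envelope:
    """Client side: steps 1-3."""
    if not 0 < secret_size <= MAX_SECRET_SIZE:
        raise ValueError(f"secret must be 1..{MAX_SECRET_SIZE} bytes")
    secret = secrets.token_bytes(secret_size)    # step 1
    return Envelope(
        encrypted_secret=wrap_with_ek(secret),   # step 3 (public EK)
        encrypted_data=encrypt(secret, data),    # step 2
    )


def unseal(envelope: Envelope, unwrap_with_ek: KeyOp, decrypt: Cipher) -> bytes:
    """Guest side: steps 5-6."""
    secret = unwrap_with_ek(envelope.encrypted_secret)   # step 5 (TPM)
    return decrypt(secret, envelope.encrypted_data)      # step 6
```

In a real deployment `wrap_with_ek` would run on the client with only the public EK, while `unwrap_with_ek` would be the one operation that needs the guest's TPM.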

A secure bootstrap model has been [demonstrated before](https://apps.dtic.mil/sti/pdfs/AD1034655.pdf), which eventually became [Keylime](https://github.com/keylime/keylime). I think Keylime is great, and nothing in the above workflow precludes leveraging Keylime for ongoing, secure communication with a guest, either from/to Cloud-Init or other actors.

However, I think the bootstrap problem can be addressed without requiring key exchange, whether through explicit attestation or storage. Instead it can rely on a simplified form of implicit attestation via the EK (there may or may not be an auth policy that relies on a specific combination of PCRs), something every TPM already has or is able to deterministically generate. To that end, if an EK-based bootstrap model _were_ adopted, Cloud-Init could use it directly rather than leaving it up to every datasource implementation. For example, the [Datasource interface](https://github.com/canonical/cloud-init/blob/main/cloudinit/sources/__init__.py) could be augmented with:

```python
    def _get_encrypted_data(self) -> bool:
        """Crawl sources for encrypted data and save it as attributes."""
        raise NotImplementedError(
            "Subclasses of DataSource must implement _get_encrypted_data which"
            " sets self.encrypted_metadata, self.encrypted_vendordata_raw and"
            " self.encrypted_userdata_raw."
        )

    @property
    def encrypted_shared_secret(self):
        """Returns the encrypted shared secret that is used to decrypt one
        or all of metadata, userdata, and vendordata."""
        return None

    @property
    def encryption_type(self):
        """Returns the type of encryption used with the shared secret to
        decrypt one or all of metadata, userdata, and vendordata."""
        return None
```
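
To make the division of labor concrete, here is a rough sketch of how core might consume those hooks. Everything in it is hypothetical: `ExampleDataSource` stands in for a datasource that opts in, `decrypt_userdata` is an invented helper, the property is assumed to be spelled `encryption_type`, and both the TPM unseal step and the symmetric decryptors are injected as callables rather than implemented.

```python
from typing import Callable, Dict, Optional

# (shared_secret, ciphertext) -> plaintext
Decryptor = Callable[[bytes, bytes], bytes]


class ExampleDataSource:
    """Hypothetical stand-in for a DataSource subclass that opts in."""

    def __init__(self, secret: Optional[bytes], kind: Optional[str],
                 userdata: Optional[bytes]):
        self._secret = secret
        self._kind = kind
        self.encrypted_userdata_raw = userdata

    @property
    def encrypted_shared_secret(self) -> Optional[bytes]:
        return self._secret

    @property
    def encryption_type(self) -> Optional[str]:
        return self._kind


def decrypt_userdata(ds: ExampleDataSource,
                     unseal_secret: Callable[[bytes], bytes],
                     decryptors: Dict[str, Decryptor]) -> Optional[bytes]:
    """Return decrypted userdata, or None if the datasource did not opt in."""
    wrapped = ds.encrypted_shared_secret
    if wrapped is None or ds.encrypted_userdata_raw is None:
        return None  # encryption stays optional
    decryptor = decryptors.get(ds.encryption_type)
    if decryptor is None:
        raise ValueError(f"unsupported encryption type: {ds.encryption_type!r}")
    secret = unseal_secret(wrapped)  # the only step that touches the TPM
    return decryptor(secret, ds.encrypted_userdata_raw)
```

The point of the shape is that a datasource only supplies ciphertext and a type tag; the TPM interaction lives in one place in core.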

With something like the above, it could be up to Cloud-Init core to handle interfacing with the TPM to decrypt the shared secret via the [example outlined in `tpm2-ekunseal.sh`](https://github.com/akutz/go-tpm/blob/feature/enc-to-ek-sans-tpm/examples/tpm2-ekseal/tpm2-ekunseal.sh). I imagine shelling out to [tpm2-tools](https://github.com/tpm2-software/tpm2-tools) is a better option than requiring Cloud-Init to depend on [tpm2-pytss](https://github.com/tpm2-software/tpm2-pytss) directly: it keeps the use of encryption optional without adding another dependency to Cloud-Init's Python modules.
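
As a rough illustration of the "optional, shell out" idea, the sketch below probes for a tool on `PATH` and runs an arbitrary command with data on stdin. The helper names `tool_available` and `run_tool` are hypothetical, probing for `tpm2_unseal` is only an assumption about which binary would be needed, and the actual command sequence is whatever the linked script does, which is not reproduced here. In Cloud-Init itself the existing `subp` helpers would presumably be used instead of `subprocess` directly.

```python
import shutil
import subprocess
from typing import Sequence


def tool_available(name: str = "tpm2_unseal") -> bool:
    """Report whether an external tool is on PATH (assumed probe target)."""
    return shutil.which(name) is not None


def run_tool(argv: Sequence[str], stdin_data: bytes = b"",
             timeout: float = 30.0) -> bytes:
    """Run a command, feed it stdin_data, and return its raw stdout."""
    proc = subprocess.run(
        list(argv),
        input=stdin_data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=False,
    )
    if proc.returncode != 0:
        # Surface stderr so a missing or locked TPM is diagnosable.
        detail = proc.stderr.decode(errors="replace").strip()
        raise RuntimeError(f"{argv[0]} exited with {proc.returncode}: {detail}")
    return proc.stdout
```

If `tool_available()` is false, core could simply skip the encrypted path, so guests without tpm2-tools installed behave exactly as they do today.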

Anyway, I am quite happy to discuss this further if there is interest. Thanks!



[enhancement]: Support for an encrypted metadata/userdata workflow via the Datasource interface #4417
