Broker access problems with Apache2 #2914

Open
famez opened this issue Nov 17, 2024 · 14 comments

Comments

@famez

famez commented Nov 17, 2024

Hello,

Frankly, I don't know where to post this issue, because I can't tell whether the problem lies in the Apache2 architecture, OpenSSL, the TPM2 provider, or the TSS2 implementation.

I configured Apache2 to use the TPM2 provider with a private key that was imported into the TPM2 and encoded as a TSS2 private key (-----BEGIN TSS2 PRIVATE KEY-----).

When the application starts, everything seems fine, but with Chrome-based browsers I sometimes get the following error:

image

After hours of debugging, I think I found the problem.

Chrome-based browsers open 2 concurrent connections (TLS 1.1 and TLS 1.2) for the same HTTPS request, so 2 Apache2 instances may process them concurrently. The bug also happens in any browser if several requests are made at the same time, but it is easier to spot in Chrome because of the 2 TLS connections for a single request.

When processing concurrently, the Apache2 server behaves differently depending on the MPM configuration:

  • Prefork: Apache spawns several child processes and the main process dispatches incoming requests to the children.
  • Worker/Event: same as above, but thread-based instead of process-based.

Apache initializes the OpenSSL context in the main process, which means that the TPM 2.0 context created by the provider also lives in the main process. Depending on the MPM configuration, requests are then dispatched to child threads, child processes, or a mix of both (threads inside different child processes). The threads and processes therefore inherit the TPM2 context as it is.

When OpenSSL must sign something during the TLS handshake to prove authenticity (the only place where the TPM 2.0 is needed, for private-key signature operations), the following operations run on the TPM 2.0, in order:

  • ESYS_HASH_ASYNC
  • ESYS_HASH_FINISH
  • ESYS_SIGN_ASYNC
  • ESYS_SIGN_FINISH

Concurrency causes problems at different levels:

  1. When using threads: when concurrent requests arrive, the provider breaks because the context is in an unexpected state when, for example, the following happens:

Thread 1 - ESYS_HASH_ASYNC
Thread 2 - ESYS_HASH_ASYNC --> Calling Hash_async twice in a row
Thread 1 - ESYS_HASH_FINISH

Here, the tss2 library finds during thread 2's ESYS_HASH_ASYNC call that the context is not in the expected state, and fails with the following error:

ERROR:esys:src/tss2-esys/esys_iutil.c:1253:iesys_check_sequence_async() Esys called in bad sequence.
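
The interleaving above can be replayed without a TPM. The following is a minimal sketch, not the tpm2-tss API: `toy_ctx`, `toy_async` and `toy_finish` are hypothetical stand-ins for a context that only accepts calls in the order async, then finish. It shows the second async call being rejected, and then two threads sharing the context without errors once a `pthread_mutex_t` is held around each whole async+finish pair.

```c
#include <pthread.h>

/* Hypothetical stand-in for a context that, like an ESYS context, only
 * accepts calls in the order async -> finish. Not the tpm2-tss API. */
enum toy_state { TOY_IDLE, TOY_SENT };

struct toy_ctx {
    enum toy_state state;
    pthread_mutex_t lock;   /* serializes whole async+finish pairs */
};

static int toy_async(struct toy_ctx *c) {
    if (c->state != TOY_IDLE) return -1;   /* "called in bad sequence" */
    c->state = TOY_SENT;
    return 0;
}

static int toy_finish(struct toy_ctx *c) {
    if (c->state != TOY_SENT) return -1;
    c->state = TOY_IDLE;
    return 0;
}

/* Replays the interleaving from the report in a single thread:
 * two async calls in a row. Returns the result of the second one. */
int toy_replay_bad_interleaving(void) {
    struct toy_ctx c;
    c.state = TOY_IDLE;
    pthread_mutex_init(&c.lock, NULL);
    toy_async(&c);              /* "thread 1" */
    int rc = toy_async(&c);     /* "thread 2": a command is still pending */
    toy_finish(&c);             /* "thread 1" */
    pthread_mutex_destroy(&c.lock);
    return rc;
}

struct toy_job { struct toy_ctx *ctx; int failures; };

static void *toy_worker(void *arg) {
    struct toy_job *job = arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&job->ctx->lock);
        if (toy_async(job->ctx) != 0) job->failures++;
        if (toy_finish(job->ctx) != 0) job->failures++;
        pthread_mutex_unlock(&job->ctx->lock);
    }
    return NULL;
}

/* Runs two threads against one context with the lock held around each
 * async+finish pair; returns the total number of rejected calls. */
int toy_run_locked(void) {
    struct toy_ctx c;
    struct toy_job jobs[2] = { { &c, 0 }, { &c, 0 } };
    pthread_t t[2];
    c.state = TOY_IDLE;
    pthread_mutex_init(&c.lock, NULL);
    for (int i = 0; i < 2; i++) pthread_create(&t[i], NULL, toy_worker, &jobs[i]);
    for (int i = 0; i < 2; i++) pthread_join(t[i], NULL);
    pthread_mutex_destroy(&c.lock);
    return jobs[0].failures + jobs[1].failures;
}
```

Calling `toy_replay_bad_interleaving()` returns -1 for the second async call, while `toy_run_locked()` returns 0. A thread mutex like this only helps inside one process; it does nothing for forked children.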

image

A first solution was to switch away from the event/worker MPMs to avoid the thread issues, which is what Apache recommends for modules that are not thread-safe.

  2. When using processes: here we have 2 problems, because the TPM2 context is inherited by the child processes:

  - With the tpm2-abrmd package: the library is initialized before the child processes are forked, so the Unix sockets created to talk to the D-Bus daemon are inherited by the children. From the point of view of D-Bus, all requests then come from the same process, because they all use the same Unix socket.
  - With tpm2-abrmd disabled and relying only on the kernel RM (tpmrm0), I get the following error:

SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
ERROR:tcti:src/util-io/io.c:114:write_all() failed to write to fd 4: Device or resource busy
ERROR:tcti:src/tss2-tcti/tcti-device.c:124:tcti_device_transmit() wrong number of bytes written. Expected 137, wrote 0.
ERROR:esys:src/tss2-esys/api/Esys_Sign.c:213:Esys_Sign_Async() Finish (Execute Async) ErrorCode (0x000a000a)
ERROR:esys:src/tss2-esys/api/Esys_Sign.c:82:Esys_Sign() Error in async function ErrorCode (0x000a000a)
2024/11/17 16:49:57 [crit] 2716#2716: *4 SSL_do_handshake() failed (SSL: error:4000000F:tpm2::cannot sign:655370 tcti:IO failure error:0A080006:SSL routines::EVP lib) while SSL handshaking, client: 10.81.234.3, server: 0.0.0.0:443

After that, a number of different errors occur.

I ran a test with a simple C program that initializes the SSL (TPM 2.0) context in the main process and then forks several child processes, each listening for incoming TLS connections on its own port. I get the same errors as above.

The second test is the same, except that the context is initialized independently in every child instead of in the main process. Here, everything works well.

So my assumption is that Apache2 initializes the TSS context in the parent process, all children then share that same context, and the result is a mess. The context should instead be initialized in every child process (threads don't work directly at the TSS2 level).
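
As a small illustration of what "inherited" means here, the sketch below opens a descriptor before fork() and lets the child use it. It uses a temporary file, not the TPM device, and `offset_seen_by_parent` is a hypothetical helper name. Parent and child end up sharing one open file description, so the child's writes move the offset the parent sees. A descriptor to /dev/tpmrm0 or an abrmd socket opened before the fork would be shared in the same way; whether that alone explains the errors above is an assumption.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Opens a temporary file before fork(), lets the child write `n` bytes
 * through the inherited descriptor, and returns the file offset the parent
 * observes afterwards (or -1 on error). */
long offset_seen_by_parent(size_t n) {
    FILE *tmp = tmpfile();
    if (!tmp) return -1;
    int fd = fileno(tmp);

    pid_t pid = fork();
    if (pid < 0) { fclose(tmp); return -1; }
    if (pid == 0) {
        /* Child: uses the descriptor it inherited from the parent. */
        for (size_t i = 0; i < n; i++) {
            if (write(fd, "x", 1) != 1) _exit(1);
        }
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    /* The parent never wrote anything, yet its offset has moved. */
    long off = (long)lseek(fd, 0, SEEK_CUR);
    fclose(tmp);
    return off;
}
```

With `n = 5` the parent sees offset 5 although only the child wrote. Opening the descriptor after the fork, as in the second test, gives every child its own independent one.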

Exactly the same happens with nginx.

Can you help me understand whether my assumption is right? What should the solution be? I tried some workarounds with mutexes/semaphores, which only made things worse.

How is it possible that no one has run into these errors before when using Apache2 or nginx?

Here is all the useful information:

  • apache2 2.4.52
  • OpenSSL 3.0.13
  • tpm2-provider 1.2.0
  • libtss2 4.1.0
  • tpm2-tools 5.6 (FYI)
  • TPM2 TCG2 spec: 1.64

apache2.service content:

[Unit]
Description=The Apache HTTP Server
After=network.target remote-fs.target nss-lookup.target
Documentation=https://httpd.apache.org/docs/2.4/

[Service]
Type=forking
Environment=APACHE_STARTED_BY_SYSTEMD=true
ExecStart=/usr/sbin/apachectl start
ExecStop=/usr/sbin/apachectl graceful-stop
ExecReload=/usr/sbin/apachectl graceful
KillMode=mixed
PrivateTmp=true
Restart=on-abort

Environment="OPENSSL_CONF=/usr/local/ssl/tpm-openssl.cnf"
Environment="OPENSSL_MODULES=/usr/local/ssl/lib/ossl-modules/"

[Install]
WantedBy=multi-user.target

Content of tpm-openssl.cnf (relevant part):

...
[openssl_init]
alg_section = evp_properties
providers = provider_sect

# List of providers to load
[provider_sect]
tpm2 = tpm2_sect
default = default_sect

[evp_properties]
#default_properties = "?provider=tpm2"
default_properties = "?provider=tpm2,tpm2.rand!=yes,tpm2.digest!=yes"

[tpm2_sect]
activate = 1

[default_sect]
activate = 1
...

@famez
Author

famez commented Nov 17, 2024

Here are the 2 code snippets for the 2 tests:

  1. Context initialization on the parent process (fails):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/wait.h> // Added for wait()
#include <openssl/ssl.h>
#include <openssl/err.h>

#define START_PORT 8000
#define END_PORT 8010
#define PASS ""

// Password callback for OpenSSL
int password_callback(char *buf, int size, int rwflag, void *userdata) {
    const char *password = (const char *)userdata;
    int len = strlen(password);

    if (len > size) {
        len = size; // Ensure the buffer doesn't overflow
    }

    strncpy(buf, password, len);
    return len;
}

// Initialize OpenSSL context
SSL_CTX* init_openssl() {
    SSL_CTX* ctx;

#if OPENSSL_VERSION_NUMBER < 0x10100000L
    SSL_library_init();
    OpenSSL_add_all_algorithms();
    SSL_load_error_strings();
#else
    OpenSSL_add_ssl_algorithms();
    SSL_load_error_strings();
#endif

    ctx = SSL_CTX_new(TLS_server_method());
    if (!ctx) {
        perror("Unable to create SSL context");
        ERR_print_errors_fp(stderr);
        exit(EXIT_FAILURE);
    }

    // Set password callback
    const char *password = PASS; // Replace with a secure method to retrieve the password
    SSL_CTX_set_default_passwd_cb(ctx, password_callback);
    SSL_CTX_set_default_passwd_cb_userdata(ctx, (void *)password);

    // Load certificate and private key
    if (SSL_CTX_use_certificate_file(ctx, "server_cert.pem", SSL_FILETYPE_PEM) <= 0 ||
        SSL_CTX_use_PrivateKey_file(ctx, "test.pem", SSL_FILETYPE_PEM) <= 0) {
        ERR_print_errors_fp(stderr);
        exit(EXIT_FAILURE);
    }

    return ctx;
}

// Create a socket and bind it to a specific port
int create_server_socket(int port) {
    int server_fd;
    struct sockaddr_in addr;

    server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) {
        perror("Socket creation failed");
        exit(EXIT_FAILURE);
    }

    // Prepare address structure
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = INADDR_ANY;

    // Bind the socket
    if (bind(server_fd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("Bind failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    // Listen for connections
    if (listen(server_fd, 5) < 0) {
        perror("Listen failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    return server_fd;
}

// Handle SSL connections
void handle_connection(SSL_CTX* ctx, int server_fd) {
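    struct sockaddr_in local_addr, client_addr;
    socklen_t local_len = sizeof(local_addr);
    socklen_t client_len = sizeof(client_addr);
    int client_fd;
    SSL* ssl;

    // Look up the listening port; client_addr is only filled in by accept()
    memset(&local_addr, 0, sizeof(local_addr));
    getsockname(server_fd, (struct sockaddr*)&local_addr, &local_len);
    printf("Waiting for connections on port %d...\n", ntohs(local_addr.sin_port));
    while ((client_fd = accept(server_fd, (struct sockaddr*)&client_addr, &client_len)) >= 0) {
        printf("Connection accepted from client port %d\n", ntohs(client_addr.sin_port));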

        // Create SSL structure and bind to the client socket
        ssl = SSL_new(ctx);
        SSL_set_fd(ssl, client_fd);

        // Perform SSL handshake
        if (SSL_accept(ssl) <= 0) {
            ERR_print_errors_fp(stderr);
        } else {
            printf("SSL connection established\n");
            SSL_write(ssl, "Hello, SSL Client!\n", 20);
        }

        // Clean up
        SSL_shutdown(ssl);
        SSL_free(ssl);
        close(client_fd);
    }

    if (client_fd < 0) {
        perror("Accept failed");
    }
}

int main() {
    SSL_CTX* ctx = init_openssl();
    int port;

    for (port = START_PORT; port <= END_PORT; port++) {
        pid_t pid = fork();

        if (pid == 0) { // Child process
            int server_fd = create_server_socket(port);
            handle_connection(ctx, server_fd);
            close(server_fd);
            exit(0);
        } else if (pid < 0) { // Fork failed
            perror("Fork failed");
            exit(EXIT_FAILURE);
        }
    }

    // Parent process waits for all child processes
    while (wait(NULL) > 0);

    SSL_CTX_free(ctx);
    EVP_cleanup();

    return 0;
}
  2. Code snippet with the initialization in every child process after fork (works):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/wait.h> // Added for wait()
#include <openssl/ssl.h>
#include <openssl/err.h>

#define START_PORT 8000
#define END_PORT 8010

// Password callback for OpenSSL
int password_callback(char *buf, int size, int rwflag, void *userdata) {
    const char *password = (const char *)userdata;
    int len = strlen(password);

    if (len > size) {
        len = size; // Ensure the buffer doesn't overflow
    }

    strncpy(buf, password, len);
    return len;
}

// Initialize OpenSSL context
SSL_CTX* init_openssl() {
    SSL_CTX* ctx;

#if OPENSSL_VERSION_NUMBER < 0x10100000L
    SSL_library_init();
    OpenSSL_add_all_algorithms();
    SSL_load_error_strings();
#else
    OpenSSL_add_ssl_algorithms();
    SSL_load_error_strings();
#endif

    ctx = SSL_CTX_new(TLS_server_method());
    if (!ctx) {
        perror("Unable to create SSL context");
        ERR_print_errors_fp(stderr);
        exit(EXIT_FAILURE);
    }

    // Set password callback
    const char *password = ""; // Replace with a secure method to retrieve the password
    SSL_CTX_set_default_passwd_cb(ctx, password_callback);
    SSL_CTX_set_default_passwd_cb_userdata(ctx, (void *)password);

    // Load certificate and private key
    if (SSL_CTX_use_certificate_file(ctx, "server_cert.pem", SSL_FILETYPE_PEM) <= 0 ||
        SSL_CTX_use_PrivateKey_file(ctx, "test.pem", SSL_FILETYPE_PEM) <= 0) {
        ERR_print_errors_fp(stderr);
        exit(EXIT_FAILURE);
    }

    return ctx;
}

// Create a socket and bind it to a specific port
int create_server_socket(int port) {
    int server_fd;
    struct sockaddr_in addr;

    server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) {
        perror("Socket creation failed");
        exit(EXIT_FAILURE);
    }

    // Prepare address structure
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = INADDR_ANY;

    // Bind the socket
    if (bind(server_fd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("Bind failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    // Listen for connections
    if (listen(server_fd, 5) < 0) {
        perror("Listen failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    return server_fd;
}

// Handle SSL connections
void handle_connection(SSL_CTX* ctx, int server_fd) {
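    struct sockaddr_in local_addr, client_addr;
    socklen_t local_len = sizeof(local_addr);
    socklen_t client_len = sizeof(client_addr);
    int client_fd;
    SSL* ssl;

    // Look up the listening port; client_addr is only filled in by accept()
    memset(&local_addr, 0, sizeof(local_addr));
    getsockname(server_fd, (struct sockaddr*)&local_addr, &local_len);
    printf("Waiting for connections on port %d...\n", ntohs(local_addr.sin_port));
    while ((client_fd = accept(server_fd, (struct sockaddr*)&client_addr, &client_len)) >= 0) {
        printf("Connection accepted from client port %d\n", ntohs(client_addr.sin_port));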

        // Create SSL structure and bind to the client socket
        ssl = SSL_new(ctx);
        SSL_set_fd(ssl, client_fd);

        // Perform SSL handshake
        if (SSL_accept(ssl) <= 0) {
            ERR_print_errors_fp(stderr);
        } else {
            printf("SSL connection established\n");
            SSL_write(ssl, "Hello, SSL Client!\n", 20);
        }

        // Clean up
        SSL_shutdown(ssl);
        SSL_free(ssl);
        close(client_fd);
    }

    if (client_fd < 0) {
        perror("Accept failed");
    }
}

int main() {
    int port;

    for (port = START_PORT; port <= END_PORT; port++) {
        pid_t pid = fork();

        if (pid == 0) { // Child process
            // Initialize SSL context inside the child process
            SSL_CTX* ctx = init_openssl();

            // Create server socket for this port
            int server_fd = create_server_socket(port);

            // Handle SSL connections on this port
            handle_connection(ctx, server_fd);

            // Clean up
            close(server_fd);
            SSL_CTX_free(ctx);
            EVP_cleanup();

            exit(0);
        } else if (pid < 0) { // Fork failed
            perror("Fork failed");
            exit(EXIT_FAILURE);
        }
    }

    // Parent process waits for all child processes
    while (wait(NULL) > 0);

    return 0;
}

@famez
Author

famez commented Nov 17, 2024

The key and cert are generated with the following commands:

tpm2_createprimary -G ecc -c primary.ctx
tpm2_evictcontrol -c primary.ctx  0x81000010
openssl req -nodes -x509 -subj "/C=$country/CN=$commonName" -keyout server_key.pem -out server_cert.pem
tpm2_import -C primary.ctx -G rsa -i server_key.pem -u test.pub -r test.priv
tpm2_encodeobject -C 0x81000010 -u test.pub -r test.priv -o test.pem

Our use case forces us to import keys in the standard PEM format generated by OpenSSL rather than create them.

@AndreasFuchsTPM
Member

I guess the only 2 solutions would be for either Apache or tpm2-openssl to implement mutexes.
The purpose and intent of tpm2-tss's contexts is to have 1 context per thread.

Maybe you can add an issue or even a pull request to the tpm2-openssl project.

@famez
Author

famez commented Dec 2, 2024

Thanks, I will open an issue in the tpm2-openssl project. In any case, the problems also seem to appear when a single process configures the context and then forks child processes.

@JuergenReppSIT
Member

Thanks, I will open an issue in the tpm2-openssl project. In any case, the problems also seem to appear when a single process configures the context and then forks child processes.

So perhaps defining a custom apache2 module acting alongside the default module without modifying its behaviour directly could solve this problem. For this module a child_init_hook could be defined to enable the reloading of the provider after a fork.
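
The general idea of rebuilding state in the child after a fork can be sketched with plain POSIX, outside of Apache. This is not an Apache module and not the provider's code: `toy_ctx`, `toy_ctx_init` and `child_gets_fresh_context` are hypothetical names, and `pthread_atfork()` is used here as a generic analogue of a child-init hook. An Apache module would register its own hook instead.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-process state standing in for a provider/TSS context. */
struct toy_ctx { pid_t owner; int generation; };
static struct toy_ctx g_ctx;

static void toy_ctx_init(void) {
    g_ctx.owner = getpid();
    g_ctx.generation++;
}

/* Runs in the child right after fork(): replace the inherited state with
 * a fresh one that belongs to this process. */
static void toy_child_handler(void) {
    toy_ctx_init();
}

/* Initializes the context in the parent, forks once, and returns 0 if the
 * child saw a context it owns (re-created after the fork), 1 otherwise,
 * or -1 on error. */
int child_gets_fresh_context(void) {
    static int registered = 0;
    if (!registered) {
        if (pthread_atfork(NULL, NULL, toy_child_handler) != 0) return -1;
        registered = 1;
    }
    toy_ctx_init();
    int parent_generation = g_ctx.generation;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int ok = g_ctx.owner == getpid() &&
                 g_ctx.generation == parent_generation + 1;
        _exit(ok ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

The handler stays registered for every later fork() in the process. Whether the tpm2 provider can actually be torn down and reloaded from such a hook is a separate question that this sketch does not answer.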

@famez
Author

famez commented Dec 11, 2024

Hello,

Finally, @Danigaralfo and I arrived at what we think is a stable solution: semaphores that protect every child of a forked/threaded process (the case for Apache/nginx).

Here is the patch:
semaphore.patch, to be applied to https://github.com/tpm2-software/tpm2-openssl/tree/1.2.0

Before arriving at this solution we tried several other things, which I record here for the sake of traceability.

First, we tried a third C code snippet that simulates the behaviour of Apache/nginx but reopens the SSL context in each child, to check whether this works before starting a new Apache module that implements it:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/wait.h> // Added for wait()
#include <openssl/ssl.h>
#include <openssl/err.h>

#define START_PORT 8000
#define END_PORT 8010
#define PASS ""

// Password callback for OpenSSL
int password_callback(char *buf, int size, int rwflag, void *userdata) {
    const char *password = (const char *)userdata;
    int len = strlen(password);

    if (len > size) {
        len = size; // Ensure the buffer doesn't overflow
    }

    strncpy(buf, password, len);
    return len;
}

// Initialize OpenSSL context
SSL_CTX* init_openssl() {
    SSL_CTX* ctx;

#if OPENSSL_VERSION_NUMBER < 0x10100000L
    SSL_library_init();
    OpenSSL_add_all_algorithms();
    SSL_load_error_strings();
#else
    OpenSSL_add_ssl_algorithms();
    SSL_load_error_strings();
#endif

    ctx = SSL_CTX_new(TLS_server_method());
    if (!ctx) {
        perror("Unable to create SSL context");
        ERR_print_errors_fp(stderr);
        exit(EXIT_FAILURE);
    }

    // Set password callback
    const char *password = PASS; // Replace with a secure method to retrieve the password
    SSL_CTX_set_default_passwd_cb(ctx, password_callback);
    SSL_CTX_set_default_passwd_cb_userdata(ctx, (void *)password);

    // Load certificate and private key
    if (SSL_CTX_use_certificate_file(ctx, "server_cert.pem", SSL_FILETYPE_PEM) <= 0 ||
        SSL_CTX_use_PrivateKey_file(ctx, "test.pem", SSL_FILETYPE_PEM) <= 0) {
        ERR_print_errors_fp(stderr);
        exit(EXIT_FAILURE);
    }

    return ctx;
}

// Create a socket and bind it to a specific port
int create_server_socket(int port) {
    int server_fd;
    struct sockaddr_in addr;

    server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) {
        perror("Socket creation failed");
        exit(EXIT_FAILURE);
    }

    // Prepare address structure
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = INADDR_ANY;

    // Bind the socket
    if (bind(server_fd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("Bind failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    // Listen for connections
    if (listen(server_fd, 5) < 0) {
        perror("Listen failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    return server_fd;
}

// Handle SSL connections
void handle_connection(SSL_CTX* ctx, int server_fd) {
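    struct sockaddr_in local_addr, client_addr;
    socklen_t local_len = sizeof(local_addr);
    socklen_t client_len = sizeof(client_addr);
    int client_fd;
    SSL* ssl;

    // Look up the listening port; client_addr is only filled in by accept()
    memset(&local_addr, 0, sizeof(local_addr));
    getsockname(server_fd, (struct sockaddr*)&local_addr, &local_len);
    printf("Waiting for connections on port %d...\n", ntohs(local_addr.sin_port));
    while ((client_fd = accept(server_fd, (struct sockaddr*)&client_addr, &client_len)) >= 0) {
        printf("Connection accepted from client port %d\n", ntohs(client_addr.sin_port));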

        // Create SSL structure and bind to the client socket
        ssl = SSL_new(ctx);
        SSL_set_fd(ssl, client_fd);

        // Perform SSL handshake
        if (SSL_accept(ssl) <= 0) {
            ERR_print_errors_fp(stderr);
        } else {
            printf("SSL connection established\n");
            SSL_write(ssl, "Hello, SSL Client!\n", 20);
        }

        // Clean up
        SSL_shutdown(ssl);
        SSL_free(ssl);
        close(client_fd);
    }

    if (client_fd < 0) {
        perror("Accept failed");
    }
}

int main() {
    SSL_CTX* ctx = init_openssl();
    int port;

    for (port = START_PORT; port <= END_PORT; port++) {
        usleep(100000);
        pid_t pid = fork();

        if (pid == 0) { // Child process
            // Drop the context inherited from the parent and create a new one
            SSL_CTX_free(ctx);
            ctx = init_openssl();
            int server_fd = create_server_socket(port);
            handle_connection(ctx, server_fd);
            close(server_fd);
            exit(0);
        } else if (pid < 0) { // Fork failed
            perror("Fork failed");
            exit(EXIT_FAILURE);
        }
    }

    // Parent process waits for all child processes
    while (wait(NULL) > 0);

    SSL_CTX_free(ctx);
    EVP_cleanup();

    return 0;
}

This didn't work: the provider is initialized only once, so the cprov context is not reinitialized.

Then we tried mutexes at thread level with the following patches:

signature 1.patch (original patch by @Danigaralfo), then my modification:
good_thread.patch

This worked at thread level, but there were still problems with Apache and nginx when several processes, not just threads, are spawned.

Then we tried modifying the provider to create several contexts, with every process picking a different one (warning: ugly patch, just to test whether the problem goes away with several different contexts):
contexts.patch

There are fewer errors, but it still fails sometimes and is unusable in production, because one has to estimate the number of spawned child processes.

Finally, we came up with semaphores that protect every child thread/process spawned from the parent. At first we discarded this approach because there were also failures, but I had missed the fact that the sem_t struct must be created in shared memory, as the man page states (I skipped that at the beginning).

I hope this helps. For the moment we will use this as a temporary solution, since the final solution should not necessarily involve mutexes.

Thanks @AndreasFuchsTPM and @JuergenReppSIT for the help.

@famez
Author

famez commented Dec 11, 2024

Hello again,

I attached the wrong patch in my previous comment: it was still missing the creation of the semaphore in shared memory. This is the correct one:
semaphore.patch

@famez
Author

famez commented Dec 11, 2024

This is the script for testing the result. If any 1s are printed on the screen, something went wrong; if all 0s, we are OK:

#!/bin/bash

# Open 101 concurrent TLS connections and print each client's exit status
for i in {1..101}; do
        { openssl s_client -connect 10.36.2.5:443 </dev/null &>/dev/null; echo $?; } &
done

# Wait for all background processes to finish
wait

@gotthardp
Contributor

@famez thank you for a detailed analysis of the problem! I implemented a mutex-based solution. Would you be able to test the https://github.com/tpm2-software/tpm2-openssl/tree/threads branch, please?

@famez
Author

famez commented Dec 15, 2024

Hi, @gotthardp, unfortunately, I keep having the same errors:

If I configure Apache2 with mpm_worker and the following configuration (only threads):

StartServers            1
MinSpareThreads         25
MaxSpareThreads         75
ThreadLimit             64
ThreadsPerChild         25
MaxRequestWorkers       150
MaxConnectionsPerChild  0

I get the following in /var/log/apache2/error.log:

SIGN DIGEST_INIT rsa MD=SHA256
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
WARNING:tcti:src/tss2-tcti/tcti-device.c:268:tcti_device_receive() Got EOF instead of response.
ERROR:esys:src/tss2-esys/api/Esys_Hash.c:310:Esys_Hash_Finish() Received a non-TPM Error
ERROR:esys:src/tss2-esys/api/Esys_Hash.c:101:Esys_Hash() Esys Finish ErrorCode (0x000a0008)
[Fri Nov 15 23:18:22.379895 2024] [ssl:info] [pid 6554:tid 6556] [client 10.36.2.250:63684] AH02008: SSL library error 1 in handshake (server 127.0.1.1:443)
[Fri Nov 15 23:18:22.379998 2024] [ssl:info] [pid 6554:tid 6556] SSL Library Error: error:4000000E:tpm2::cannot hash (655368 tcti:Fails to connect to next lower layer)
[Fri Nov 15 23:18:22.380052 2024] [ssl:info] [pid 6554:tid 6556] SSL Library Error: error:0A080006:SSL routines::EVP lib
[Fri Nov 15 23:18:22.380083 2024] [ssl:info] [pid 6554:tid 6556] [client 10.36.2.250:63684] AH01998: Connection closed to child 64 with abortive shutdown (server 127.0.1.1:443)
[Fri Nov 15 23:18:22.383641 2024] [ssl:info] [pid 6554:tid 6557] [client 10.36.2.250:63686] AH01964: Connection to child 65 established (server 127.0.1.1:443)
[Fri Nov 15 23:18:22.384507 2024] [ssl:debug] [pid 6554:tid 6557] ssl_engine_kernel.c(2421): [client 10.36.2.250:63686] AH02645: Server name not provided via TLS extension (using default/first virtual host)
SIGN DIGEST_INIT rsa MD=SHA256
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
ERROR:esys:src/tss2-esys/esys_iutil.c:1145:iesys_check_sequence_async() Esys called in bad sequence.
E

When configuring mpm_prefork with config (only forked processes without threads):

StartServers            5
MinSpareServers         5
MaxSpareServers         5
MaxRequestWorkers       5
MaxConnectionsPerChild  0

I get the following in the error log:

RSA HAS 2
RSA NEW
RSA IMPORT [ n e ]
RSA MATCH 0x6
RSA HAS 2
RSA MATCH 0x6
DER DECODER DECODE
RSA HAS 2
RSA MATCH 0x6
SIGN DIGEST_INIT rsa MD=SHA256
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
ERROR:tcti:src/util/io.c:114:write_all() failed to write to fd 3: Device or resource busy
ERROR:tcti:src/tss2-tcti/tcti-device.c:124:tcti_device_transmit() wrong number of bytes written. Expected 164, wrote 0.
ERROR:esys:src/tss2-esys/api/Esys_Hash.c:198:Esys_Hash_Async() Finish (Execute Async) ErrorCode (0x000a000a)
ERROR:esys:src/tss2-esys/api/Esys_Hash.c:78:Esys_Hash() Error in async function ErrorCode (0x000a000a)
[Fri Nov 15 23:09:33.994855 2024] [ssl:info] [pid 6239:tid 6284] [client 10.36.2.250:63527] AH02008: SSL library error 1 in handshake (server 127.0.1.1:443)
[Fri Nov 15 23:09:33.994957 2024] [ssl:info] [pid 6239:tid 6284] SSL Library Error: error:4000000E:tpm2::cannot hash (655370 tcti:IO failure)
[Fri Nov 15 23:09:33.995015 2024] [ssl:info] [pid 6239:tid 6284] SSL Library Error: error:0A080006:SSL routines::EVP lib
[Fri Nov 15 23:09:33.995048 2024] [ssl:info] [pid 6239:tid 6284] [client 10.36.2.250:63527] AH01998: Connection closed to child 195 with abortive shutdown (server 127.0.1.1:443)
[Fri Nov 15 23:09:33.997254 2024] [ssl:info] [pid 6239:tid 6274] [client 10.36.2.250:63530] AH01964: Connection to child 192 established (server 127.0.1.1:443)
[Fri Nov 15 23:09:33.998154 2024] [ssl:debug] [pid 6239:tid 6274] ssl_engine_kernel.c(2421): [client 10.36.2.250:63530] AH02645: Server name not provided via TLS extension (using default/first virtual host)
SIGN DIGEST_INIT rsa MD=SHA256
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
ERROR:esys:src/tss2-esys/esys_iutil.c:1145:iesys_check_sequence_async() Esys called in bad sequence.
ERROR:esys:src/tss2-esys/api/Esys_Hash.c:78:Esys_Hash() Error in async function ErrorCode (0x00070007)
[Fri Nov 15 23:09:34.027560 2024] [ssl:info] [pid 6239:tid 6274] [client 10.36.2.250:63530] AH02008: SSL library error 1 in handshake (server 127.0.1.1:443)
[Fri Nov 15 23:09:34.027646 2024] [ssl:info] [pid 6239:tid 6274] SSL Library Error: error:4000000E:tpm2::cannot hash (458759 esapi:Function called in the wrong order)
[Fri Nov 15 23:09:34.027697 2024] [ssl:info] [pid 6239:tid 6274] SSL Library Error: error:0A080006:SSL routines::EVP lib
[Fri Nov 15 23:09:34.027727 2024] [ssl:info] [pid 6239:tid 6274] [client 10.36.2.250:63530] AH01998: Connection closed to child 192 with abortive shutdown (server 127.0.1.1:443)
[Fri Nov 15 23:09:34.333450 2024] [ssl:info] [pid 6268:tid 6302] [client 10.36.2.250:63528] AH02008: SSL library error 1 in handshake (server 127.0.1.1:443)
[F

I reviewed the patch and I think there are 2 points here (explained above, but I repeat them):

  1. The problem is not only related to threads but also to forked processes (no threads involved). With mpm_prefork, Apache initializes all the OpenSSL state in the parent; the children then inherit everything and make a mess of the broker access (abrmd or tpmrm), and that is not covered by your code.
  2. Making the whole signature operation atomic seems to work:
SIGN DIGEST_INIT rsa MD=SHA256
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA256
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA256
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA256
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA256
Sem unlocking
Sem locked
SIGN DIGEST_INIT rsa MD=SHA2-256
SIGN SET_CTX_PARAMS rsa [ pad-mode ]
SIGN SET_CTX_PARAMS rsa [ saltlen ]
SIGN DIGEST_SIGN estimate
SIGN DIGEST_SIGN
Sem unlocking
Sem locked

Please consider trying to reproduce the bug and applying the patch we developed; it may give you more insight: https://github.com/user-attachments/files/18098206/semaphore.patch

@famez
Author

famez commented Dec 15, 2024

The easiest way to reproduce the problem is to configure Apache2 with the provider and then open several Google Chrome windows. If you are lucky, the bug shows up right away; otherwise, open more windows in incognito mode and refresh them in an interleaved way.

You can also play with the mpm_prefork and mpm_worker modules and their configuration.

@famez
Author

famez commented Dec 15, 2024

I understand that the repo is OS-agnostic, so POSIX semaphores are not the final solution, but I just tested on an RPi with a real physical TPM (implementing TCG revision 1.16) and in a Proxmox VM (with a virtual TPM complying with TCG revision 1.64), and both work without failures with the semaphore patch.

@famez
Author

famez commented Dec 15, 2024

For a reason I don't know right now, it seems that these operations must be atomic:

  • ESYS_HASH_ASYNC
  • ESYS_HASH_FINISH
  • ESYS_SIGN_ASYNC
  • ESYS_SIGN_FINISH

This is achieved by locking a semaphore in the signature context init function and unlocking it in the signature context free function.

The semaphore is created in the parent process in such a way that it is inherited by every child thread and process:

sem = mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);

@famez
Author

famez commented Dec 15, 2024

Hi again,

I did further testing, and it seems that when only threads are spawned, the problem is solved with https://github.com/tpm2-software/tpm2-openssl/tree/threads.

I tested with the following configuration in apache:

# event MPM
# StartServers: initial number of server processes to start
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestWorkers: maximum number of worker threads
# MaxConnectionsPerChild: maximum number of requests a server process serves
StartServers            1
MinSpareThreads         50
ServerLimit              1
MaxSpareThreads         50
ThreadLimit             50
ThreadsPerChild         50
MaxRequestWorkers       150
MaxConnectionsPerChild  0

It works.

With the previous configuration, when apache2 was overloaded, the server forked some new processes.

My apologies for the previous bad testing.

However, a solution is still missing for forked processes that inherit the context from the parent. OpenSSL's CRYPTO_RWLOCK locks are not sufficient there, because they are based on thread mutexes, which do not work across processes.
