Evaluating recurrent neural networks over encrypted data using NumPy and Concrete

Introducing the public release of HNP by Zama.

Oct 13, 2021

Fully Homomorphic Encryption (FHE) is a cryptographic technique that allows you to compute on ciphertexts (encrypted messages) without needing to decrypt the messages inside them. FHE programming is notoriously hard, which is why we created an experimental compiler that converts a classical NumPy program into an FHE circuit, which can then be run using the Concrete FHE library.

Homomorphic NumPy

The Homomorphic NumPy (HNP) library allows you to convert functions operating on NumPy multidimensional arrays into a homomorphic equivalent. You write your computation in the usual way using only NumPy, and HNP takes care of the conversion.

In this article and in the accompanying video, we will showcase two examples using HNP. We will start with logistic regression to get familiar with the process of converting NumPy functions, then go for a more involved use case, Recurrent Neural Networks.

Disclaimer: HNP is an experimental tool that will not be further developed, as we are working on a new compiler with better performance and reliability. We still wanted to show how far FHE has come and enable the community to experiment while we work on the stable release.

Installing HNP

Before we can get into the examples, we need to install HNP using the Zama docker image. The container comes with the necessary libraries preinstalled, including Jupyter so that you can directly start playing with it.

# Pull the docker image
docker pull docker.io/zamafhe/hnp

# Start a Jupyter notebook
docker run --rm -it -p 8888:8888 -v /src/path:/data zamafhe/hnp

You’re ready to go!

Homomorphic Logistic Regression

Let’s start with a simple example of how to use HNP: performing inference using a logistic regression model.

The first step is to import the libraries and define the inference function. This is pretty straightforward since we assume the model is already trained.

        
import hnumpy as hnp
import numpy as np

weights = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
bias = np.array([0.1])

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
  
def func(x):
    return sigmoid(np.dot(x, weights) + bias)


To compile our NumPy function into its homomorphic equivalent, we need to provide some information about the inputs, namely, the shape of the multi-dimensional array, and its bounds. The bounds are the range in which the values of the input array fall. It’s important to note that these bounds should only take into account the input, and not any computation that might occur later on (this will be taken care of by the compiler).

        
h = hnp.compile_fhe(
  func,
  {'x': hnp.encrypted_ndarray(bounds=(-1, 1), shape=(5,))}
)
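One way to pick tight bounds is to measure them on data that is representative of what users will send. The sketch below is plain NumPy; `input_bounds` and the sample array are hypothetical names introduced here for illustration and are not part of HNP.

```python
import numpy as np

def input_bounds(samples):
    """Return the (min, max) range covered by a set of representative inputs."""
    samples = np.asarray(samples, dtype=float)
    return float(samples.min()), float(samples.max())

# Hypothetical representative inputs: 1000 samples with 5 features each.
rng = np.random.default_rng(0)
samples = rng.uniform(-1, 1, size=(1000, 5))

low, high = input_bounds(samples)
print(f"bounds=({low:.3f}, {high:.3f}), shape={samples.shape[1:]}")
```

Rounding the measured range outward (here to (-1, 1)) leaves a little room for unseen inputs while keeping the bounds close to the data.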


FHE is currently limited in terms of precision, which means bounds have to be as tight as possible. Here, we will generate some random data between -1 and 1 and use the simulate method to check that the function will run correctly on encrypted data. The result can differ slightly between the simulation and the original NumPy computation, but as long as the difference doesn’t exceed h.expected_precision(), it is considered valid.

        
x = np.random.uniform(-1, 1, 5)

print(f"Simulation result: {h.simulate(x)}")
print(f"Plain NumPy result: {func(x)}")
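The validity check described above can be written down explicitly. This is a minimal sketch in plain NumPy: `within_precision` is a hypothetical helper, and the numbers below are made-up placeholders standing in for real results and for whatever `h.expected_precision()` reports.

```python
import numpy as np

def within_precision(reference, candidate, precision):
    """True if every element of candidate is within `precision` of reference."""
    reference = np.asarray(reference, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    return bool(np.all(np.abs(reference - candidate) <= precision))

# Placeholder numbers for illustration only, not real HNP output.
plain_result = np.array([0.6225])
simulated_result = np.array([0.6230])
precision = 0.01

print(within_precision(plain_result, simulated_result, precision))
```

With a compiled function, this could be called as `within_precision(func(x), h.simulate(x), h.expected_precision())`, assuming `expected_precision()` returns an absolute error bound.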


Next, we need to generate public and private keys for the user. In FHE, the server doing the computation doesn’t need the private key since nothing is decrypted. Instead, a public key is sent for each user of the service. Note that the compilation itself is user-independent, so you only need to compile once and it will run for any public key and user of your system. Key generation currently takes a while (tens of seconds, sometimes minutes), but it only needs to be done once.

        
ctx = h.create_context()
keys = ctx.keygen()


Finally, we can run the computation. This consists of three steps: encryption of the input, evaluation of the program, and decryption of the output. In a real application, encryption and decryption are done on the user’s device, while the evaluation is done server side.

        
x_enc = keys.encrypt(x)
res = h.run(keys.public_keys, x_enc)
print(f"Encrypted computation result: {keys.decrypt(res)}")


For convenience when debugging, HNP also provides a shortcut to do all the steps at once: h.encrypt_and_run(keys, x)

That’s it! You have now successfully created your first homomorphic NumPy program. A more complete Logistic Regression example can be found here.

Recurrent Neural Networks

In this example, we will use a simple LSTM (long short-term memory) network for sentiment analysis, classifying a sentence as either positive or negative.

Deep learning models, and RNNs in particular, are notoriously hard to implement using FHE: it used to be impossible to evaluate non-linear activation functions homomorphically, or to go beyond a few layers deep because of noise accumulation in the ciphertext. Both of these issues are solved in Concrete by a novel operator called “programmable bootstrapping”, which HNP relies upon heavily.
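To give an intuition for why a lookup-based operator makes non-linear functions tractable, here is a plaintext analogy in NumPy: the activation is tabulated once at a fixed precision, and evaluating it becomes an index into that table. This is only an illustration of the idea on unencrypted values; it is not Concrete's implementation, and the helper names, the input range of [-8, 8] and the 7-bit table size are arbitrary choices made for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def build_table(func, low, high, bits):
    """Tabulate func at 2**bits evenly spaced points in [low, high]."""
    grid = np.linspace(low, high, 2 ** bits)
    return grid, func(grid)

def lookup(x, grid, table):
    """Evaluate the tabulated function by snapping x to the nearest grid point."""
    x = np.clip(x, grid[0], grid[-1])
    step = grid[1] - grid[0]
    index = np.rint((x - grid[0]) / step).astype(int)
    return table[index]

grid, table = build_table(sigmoid, -8.0, 8.0, bits=7)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
approx = lookup(x, grid, table)
max_error = np.max(np.abs(approx - sigmoid(x)))
print(f"max error of the table lookup: {max_error:.4f}")
```

The error comes entirely from the limited table resolution, which mirrors the trade-off between precision and cost mentioned throughout this article.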

Our model is an LSTM followed by a linear layer and a sigmoid activation function. We use a pre-trained word embedding and this dataset.

For this example, we will need some additional boilerplate code, as we will be using PyTorch, but the overall compilation process remains the same. The complete notebook for this example can be found here, so we will only focus on the important parts.

First, let’s define the model in PyTorch:

        
import torch

HIDDEN_SIZE = 100

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = torch.nn.LSTM(input_size=300, hidden_size=HIDDEN_SIZE)
        self.fc = torch.nn.Linear(HIDDEN_SIZE, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        _, (x, _) = self.lstm(x)
        x = self.fc(x)
        return self.sigmoid(x)


We will assume at this point that we have the trained model and want to compile it into its homomorphic equivalent. Our compile function can only take NumPy computation, so we will need to manually convert this PyTorch model to work with NumPy. Here is how to extract the learned parameters and implement the forward pass using NumPy:

        
class Inferer:
    def __init__(self, model):
        parameters = list(model.lstm.parameters())
        
        W_ii, W_if, W_ig, W_io = parameters[0].split(HIDDEN_SIZE)
        W_hi, W_hf, W_hg, W_ho = parameters[1].split(HIDDEN_SIZE)
        b_ii, b_if, b_ig, b_io = parameters[2].split(HIDDEN_SIZE)
        b_hi, b_hf, b_hg, b_ho = parameters[3].split(HIDDEN_SIZE)
        
        self.W_ii = W_ii.detach().numpy()
        self.b_ii = b_ii.detach().numpy()
        
        self.W_hi = W_hi.detach().numpy()
        self.b_hi = b_hi.detach().numpy()
        
        self.W_if = W_if.detach().numpy()
        self.b_if = b_if.detach().numpy()
        
        self.W_hf = W_hf.detach().numpy()
        self.b_hf = b_hf.detach().numpy()
        
        self.W_ig = W_ig.detach().numpy()
        self.b_ig = b_ig.detach().numpy()
        
        self.W_hg = W_hg.detach().numpy()
        self.b_hg = b_hg.detach().numpy()
        
        self.W_io = W_io.detach().numpy()
        self.b_io = b_io.detach().numpy()
        
        self.W_ho = W_ho.detach().numpy()
        self.b_ho = b_ho.detach().numpy()
        
        self.W = model.fc.weight.detach().numpy().T
        self.b = model.fc.bias.detach().numpy()
        
    def infer(self, x):
        x_t, h_t, c_t = None, np.zeros(HIDDEN_SIZE), np.zeros(HIDDEN_SIZE)
        for i in range(x.shape[0]):
            x_t = x[i]
            _, h_t, c_t = self.lstm_cell(x_t, h_t, c_t)
        r = np.dot(h_t, self.W) + self.b
        return self.sigmoid(r)
    
    def lstm_cell(self, x_t, h_tm1, c_tm1):
        i_t = self.sigmoid(
            np.dot(self.W_ii, x_t) + self.b_ii + np.dot(self.W_hi, h_tm1) + self.b_hi
        )
        f_t = self.sigmoid(
            np.dot(self.W_if, x_t) + self.b_if + np.dot(self.W_hf, h_tm1) + self.b_hf
        )
        g_t = np.tanh(
            np.dot(self.W_ig, x_t) + self.b_ig + np.dot(self.W_hg, h_tm1) + self.b_hg
        )
        o_t = self.sigmoid(
            np.dot(self.W_io, x_t) + self.b_io + np.dot(self.W_ho, h_tm1) + self.b_ho
        )
        c_t = f_t * c_tm1 + i_t * g_t
        h_t = o_t * np.tanh(c_t)
        return o_t, h_t, c_t
    
    @staticmethod
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))
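The unpacking at the top of `Inferer.__init__` relies on PyTorch storing the LSTM's four gate matrices stacked row-wise in the order input, forget, cell, output. The NumPy-only sketch below reproduces that layout with random stand-in values and small, illustrative sizes. It shows that splitting the stacked matrix and computing each gate separately gives the same pre-activations as one big matrix product.

```python
import numpy as np

# Small illustrative sizes, not the 100 / 300 used in the article.
HIDDEN_SIZE = 4
INPUT_SIZE = 3

rng = np.random.default_rng(0)

# Stand-in for parameters[0]: four (HIDDEN_SIZE, INPUT_SIZE) gate matrices
# stacked row-wise into a single (4 * HIDDEN_SIZE, INPUT_SIZE) array.
stacked_w_ih = rng.normal(size=(4 * HIDDEN_SIZE, INPUT_SIZE))

# Cutting the rows into four equal chunks recovers one matrix per gate.
W_ii, W_if, W_ig, W_io = np.split(stacked_w_ih, 4, axis=0)

x_t = rng.normal(size=INPUT_SIZE)

# One product with the stacked matrix versus one product per gate.
all_gates = stacked_w_ih @ x_t
per_gate = np.concatenate([W @ x_t for W in (W_ii, W_if, W_ig, W_io)])

print(W_ii.shape, np.allclose(all_gates, per_gate))
```

The same reasoning applies to the hidden-to-hidden weights and to both bias vectors, which are stacked in the same gate order.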


Now that we have our forward pass in pure NumPy, we can compile it. We will be using some advanced configuration options to tune the compilation:

The handselected parameter optimizer uses a pre-computed set of parameters known to work well in machine learning use cases, sacrificing some precision for faster execution.
The apply_topological_optimizations parameter should be enabled by default to ensure the FHE circuit is correctly optimized.
The probabilistic_bounds parameter controls how big the margin of error can be around the bounds of the data; a bigger value gives a bigger margin of error, but less precision.

You can play with these parameters and see how they affect the final result. Here, we limit the sentence length to five words in order to keep the running time reasonable, but feel free to use longer sentences.

        
SENTENCE_LENGTH_LIMIT = 5

inferer = Inferer(model)

homomorphic_inferer = hnp.compile_fhe(
    inferer.infer,
    {
        "x": hnp.encrypted_ndarray(bounds=(-1, 1), shape=(SENTENCE_LENGTH_LIMIT, 300))
    },
    config=hnp.config.CompilationConfig(
        parameter_optimizer="handselected",
        apply_topological_optimizations=True,
        probabilistic_bounds=6,
    ),
)


Then we generate some user keys:

        
context = homomorphic_inferer.create_context()
keys = context.keygen()


We are now ready to run the evaluation. To verify that the compilation went well, we will also output some debugging info using the simulate function. Note that sentences of fewer than five words need to be padded with zeros.

        
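import timeit

# encode() is not shown in this article; it is assumed to turn a sentence into
# an array of 300-dimensional word embeddings, raising KeyError on unknown words.
def evaluate(sentence):
    try:
        embedded = encode(sentence)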
    except KeyError as error:
        print("! the word", error, "is unknown")
        return
    if embedded.shape[0] > SENTENCE_LENGTH_LIMIT:
        print(f"! the sentence should not contain more than {SENTENCE_LENGTH_LIMIT} tokens")
        return
    padded = np.zeros((SENTENCE_LENGTH_LIMIT, 300))
    padded[SENTENCE_LENGTH_LIMIT - embedded.shape[0]:, :] = embedded
    original = model(torch.tensor(padded.reshape((-1, 1, 300))).float()).detach().numpy()[0, 0, 0]
    simulated = homomorphic_inferer.simulate(padded)[0]
    start = timeit.default_timer()
    actual = homomorphic_inferer.encrypt_and_run(keys, padded)[0]
    end = timeit.default_timer()
    if actual < 0.35:
        print("- the sentence was negative", end=' ')
    elif actual > 0.65:
        print("+ the sentence was positive", end=' ')
    else:
        print("~ the sentence was neutral", end=' ')
    print(
        f"("
        f"original: {original * 100:.2f}%, "
        f"simulated: {simulated * 100:.2f}%, "
        f"actual: {actual * 100:.2f}%, "
        f"difference: {np.abs(original - actual) * 100:.2f}%, "
        f"took: {end - start:.3f} seconds"
        f")"
    )
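The two padding lines in `evaluate` can be isolated into a small helper, which makes the left-padding behaviour easy to check on its own. `pad_left` is a hypothetical name used only in this sketch, and the input is a stand-in array rather than real word embeddings.

```python
import numpy as np

def pad_left(embedded, length, dim=300):
    """Right-align `embedded` (n_words, dim) in a zero array of shape (length, dim)."""
    n_words = embedded.shape[0]
    if n_words > length:
        raise ValueError(f"sentence has {n_words} tokens, limit is {length}")
    padded = np.zeros((length, dim))
    padded[length - n_words:, :] = embedded
    return padded

# Three stand-in "word embeddings" filled with ones.
embedded = np.ones((3, 300))
padded = pad_left(embedded, length=5)

print(padded.shape, padded[:2].sum(), padded[2:].sum())
```

The zero rows come first, so the real words are the last inputs the LSTM sees before the final hidden state is read.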


And finally, we evaluate an example:

        
evaluate("Encryption is awesome")
+ the sentence was positive (original: 99.99%, simulated: 99.99%, actual: 99.99%, difference: 0.00%, took: 33.813 seconds)


This might take anywhere from 30 seconds to 20 minutes or more, depending on your hardware, so the more cores your CPU has, the better!

Conclusion

FHE is still in its infancy, and until recently it did not work at all. While precision and speed are still barriers to adoption, they are improving, following a Moore-like law with a 10x gain in speed roughly every 18 months. This means that by 2025, FHE should be usable everywhere on the internet, from databases to machine learning and analytics!
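As a back-of-the-envelope reading of that trend, a 10x gain every 18 months compounds to a factor of 10^(t/18) after t months. The snippet below applies it to the span from this article's date to January 2025; the end date is an arbitrary choice for illustration, and the projection is only as good as the assumed trend.

```python
from datetime import date

def projected_speedup(start, end, factor=10.0, period_months=18):
    """Compound `factor` once per `period_months` between two dates."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return factor ** (months / period_months)

speedup = projected_speedup(date(2021, 10, 1), date(2025, 1, 1))
print(f"projected speedup: {speedup:.0f}x")
```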

Let us know what you build!
