Prouff & Rivain’s Formal Security Proof of Masking, Revisited
Tight Bounds in the Noisy Leakage Model

Loïc Masure and François-Xavier Standaert
ICTEAM Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
loic.masure@uclouvain.be

Abstract. Masking is a counter-measure that can be incorporated into software and hardware implementations of block ciphers to provably secure them against side-channel attacks. The security of masking can be proven in different types of threat models. In this paper, we are interested in directly proving security in the most realistic threat model, the so-called noisy leakage adversary, which captures well how real-world side-channel adversaries operate. Direct proofs in this leakage model have been established by Prouff & Rivain at Eurocrypt 2013, Dziembowski et al. at Eurocrypt 2015, and Prest et al. at Crypto 2019. These proofs are complementary to each other, in the sense that the weaknesses of one proof are fixed in at least one of the others, and conversely. These weaknesses concern in particular the strong requirements on the noise level and the security parameter needed to get meaningful security bounds, and some requirements on the type of adversary covered by the proof, i.e., chosen or random plaintexts. This suggested that the drawbacks of each security bound could actually be proof artifacts. In this paper, we solve these issues by revisiting Prouff & Rivain’s approach.

1 Introduction

1.1 Context

Side-Channel Analysis (SCA) represents an important threat for cryptographic implementations on embedded devices such as smart-cards, Micro-Controller Units (MCUs), etc. [35, 36]. In such attacks, the adversary has physical access to the target device. More precisely, the adversary is assumed to measure some physical metrics of the device called leakages, e.g. the power consumption of the device or the Electro-Magnetic (EM) emanations around the target, during one or several encryptions. It is then possible to use this side information, besides leveraging plaintexts and ciphertexts, to guess the values of sensitive variables, i.e. the values of intermediate calculations depending on some chunks of secret. This way, an SCA adversary may independently recover each chunk of the secret in a divide-and-conquer approach, making the typical complexity of such attacks often negligible compared to a regular cryptanalysis. That is why the SCA threat should be carefully taken into account in the design of cryptographic implementations.


Thankfully, this does not prevent the deployment and the use of embedded cryptography, as this threat can be mitigated by incorporating counter-measures in the implementation. At a very high level, most counter-measures, such as masking or shuffling, turn a deterministic cryptographic primitive into a non-deterministic implementation by injecting some randomness during the execution of the primitive, either at a physical level or at an algorithmic level. In this paper, we focus on the main counter-measure considered so far in SCA, namely masking [27, 16], a.k.a. “Multi-Party Computation (MPC) on silicon” [32]. In a nutshell, any sensitive variable is submitted to a (d + 1)-linear secret-sharing, where d is the security parameter that the designer may control in order to achieve the desired security level. The implementation is then modified in such a way that all the subsequent calculations involving a sensitive variable are replaced by gadgets operating on the shares separately, as in multi-party computation. As a result, any SCA adversary must have access to the noisy observation of every share of the secret to be able to recover any piece of information about a sensitive variable. If any noisy observation induces some uncertainty on the actual value of the corresponding share, it results in an amplified uncertainty on the actual value of the target sensitive variable, an intuition that dates back to the seminal work of Chari et al. at Crypto ’99 [16]. As a consequence, the complexity of any SCA attack increases exponentially fast with the security parameter d, at the price of only quadratic (or super-linear) runtime and memory overheads in the implementation [32].
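To make the sharing step concrete, here is a minimal Python sketch of Boolean (XOR-based) masking of a byte into d + 1 shares. The function names are illustrative, not taken from the paper.

```python
import secrets

def share(y: int, d: int) -> list[int]:
    """Split a byte y into a (d + 1)-tuple of shares whose XOR equals y.

    Any strict subset of the shares is uniformly distributed, so an
    adversary must learn something about every share to learn about y.
    """
    others = [secrets.randbelow(256) for _ in range(d)]
    last = y
    for s in others:
        last ^= s
    return others + [last]

def reconstruct(shares: list[int]) -> int:
    """XOR all shares back together."""
    y = 0
    for s in shares:
        y ^= s
    return y

assert reconstruct(share(0x2A, 4)) == 0x2A  # 5 shares, security parameter d = 4
```

The d random shares act as one-time pads: only the XOR of all d + 1 values carries information about y.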

1.2 Provable Security of Masking

The latter intuition has been formalized over the past few years by masking security proofs. Generally speaking, a masking security proof takes as inputs an abstract representation of the implementation, the number of shares d + 1 (where d acts as the security parameter) and a measure of the noisiness of the leakage, usually characterized from the device embedding the implementation. The masking security proof then returns an upper bound on a metric depicting the security level of the implementation. There exist different strategies to establish a masking security proof. In this paper, we focus on masking security bounds directly stated in the most realistic threat model. This approach was first considered by Chari et al. [16], before being formalized by Prouff and Rivain [45]. Concretely, a noisy observation of an intermediate calculation is a Probability Mass Function (p.m.f.) over all the hypothetical values that the operands may take: the closer the p.m.f. to the uniform distribution, the noisier the leakage. The idea of security proofs in the noisy leakage model is to assume that any noisy leakage accessed by the adversary is δ-close to the uniform distribution, for some real-valued parameter δ stated in a metric that can be measured by the practitioner.¹ Then, the goal is to prove that the p.m.f. of the secret key, given an

¹ E.g., the Statistical Distance (SD), the Euclidean Norm (EN), or the Mutual Information (MI). Notice that in our context, “noisier” means a lower δ.

devices by evaluators, and hereupon the RE may not be efficiently tractable — especially for high-dimensional leakage — nor even be formally defined in some cases. As an example, Prest et al. even needed to use tedious tail-cut arguments on the exemplary leakage distributions of their case study [ 44 , Remark 2].

  4. Random message attacks [45]. Last but not least, Prouff and Rivain’s security bounds are given for random message attacks, whereas Dziembowski et al. and Prest et al. state security bounds for chosen plaintext attacks. Even if most state-of-the-art SCA adversaries consider random plaintext attacks, this contrasts with the common practice in cryptography, where the adversary is assumed to (adaptively) choose the message or the ciphertext.

Table 1: Comparison between all proofs in the Noisy Leakage model: Prouff & Rivain [45], Dziembowski et al. [25], Prest et al. [44].

Feature                    [45]   [25]   [44]           Our work
Strong noise requirement   Yes    No     No             No
Leak-free refreshing       Yes    Yes    Yes (Sec. 6)   Yes
Incentive to small δ       ✓      ✗      ✓              ✓
Average-case metric        ✓      ✓      ✗              ✓
Adaptive attacks           ✗      ✓      ✓              ✓

1.3 Recent Improvements on Security Bounds for Encodings Only

In light of the drawbacks listed so far, Duc et al. conjectured at Eurocrypt 2015 that the weaknesses (1-3) were actually proof artifacts [24]. More precisely, it would be possible to prove a masking security bound in terms of MI with a tight noise requirement and tight amplification rates, while covering the leakage of the full block cipher. In a recent line of works, Ito et al. [33], Masure et al. [40], and Béguinot et al. [14] have been able to prove a reduced version of Duc et al.’s conjectured security bound, for the leakage of one encoding only. While these works represent a first milestone, they were limited in that they did not cover the leakage coming from the computations, and Duc et al.’s conjecture remained to be proven for the leakage of a full block cipher.

1.4 Our Contribution

In this paper, we prove new masking security bounds stated in the noisy leakage model, in the same setting as the previous works discussed so far, namely Rivain-Prouff’s masking scheme with leak-free refreshings [45]. To this end, we revisit Prouff and Rivain’s approach, by showing that some drawbacks of their results can be circumvented.

  • A tight bound with respect to the noise parameter δ. We leverage the recent results of Ito et al. [33], Masure et al. [40] and Béguinot et al. [14] to bound the amount of informative leakage of computations coming from a full block cipher, masked with an I.S.W.-like masking scheme. As a result, our noise requirement is tight [31], while carrying a much stronger incentive toward noisier leakage than in the previous works.
  • A security bound with low dependency on the field size. With the previous contribution alone, our final security bound would still carry a constant factor scaling quadratically with the size of the field over which the block cipher operates, regardless of the number of shares. While this is much better than Prouff & Rivain’s bound and competitive with Dziembowski et al.’s bound, this still sounds unnatural, as it does not perfectly fit Duc et al.’s conjecture [24], and might be fatal for block ciphers operating over large fields. To tackle this problem, we show how a careful scrutiny of the implementation, under mild assumptions on the Sbox, can allow us to make this constant factor quasi-linear in the field size. We even show how this constant factor overhead can further be made almost independent of the field size, by combining the Rivain-Prouff masking scheme with blinding, a well-known counter-measure in asymmetric cryptography.
  • Security Bound with Average Metric. In our masking security proof, any metric, be it the baseline noise δ or the final security bound ε, is expressed in MI. This contrasts with Prouff & Rivain’s work, where the parameters δ and ε are not expressed in the same metric. Since MI is an averaged metric, it is quite easy to estimate by evaluators when characterizing the behavior of the target device in worst-case evaluations [4].
  • Attacks with Chosen Messages. Eventually, we argue how our security bounds stated for random plaintext attacks can be extended to the case of chosen plaintext attacks, using a similar argument as the one stated by Dziembowski et al. in their follow-up work at TCC 2016 [26].

Overall, our work is the first to state a masking proof with meaningful security bounds, i.e., for which the desired security level can be reached with a reasonable number of masking shares, while requiring a reasonable amount of noise from the device. Therefore, our masking security bound can be practically used by an SCA evaluator to upper-bound current state-of-the-art SCA adversaries. This suggests that masking proofs directly stated in the noisy leakage model can be seen as complementary to the more generic proofs in other threat models. The only shortcoming of our proof, in line with the previous works, concerns the use of leak-free refreshings. We hope future works may allow relaxing this assumption, and thereby provide a setting comparable with masking security proofs following the indirect approach, which takes advantage of reductions between models.

2 Preliminaries

In this paper, we denote sets by calligraphic letters, e.g., X. In particular, the letter Y denotes a finite field (Y, ⊕, ×) of characteristic two. Upper-case letters

induced by each elementary calculation of a block cipher, and that returns a guess K̂ of one chunk K ∈ Y of the secret key K. We say that the adversary is random-plaintext if P is chosen randomly and uniformly over Y^Na, whereas we say that the adversary is chosen-plaintext if the adversary can arbitrarily choose the sequence P, possibly adaptively.

Notice that K̂ depends on the plaintexts used by the adversary (and on the internal randomness of the leakage functions). Accordingly, the accuracy of the key guessing is expected to increase with the number Na of queries. We formalize this in the definition hereafter.

Definition 2 (Success Rate). The success rate of an SCA key recovery adversary is the quantity

SR(Na) = Pr(K̂ = K).

Similarly, for any probability threshold 1/|Y| ≤ β ≤ 1, we define the efficiency Na⋆(β) of an SCA key recovery adversary as the minimal number of queries necessary to get a success rate higher than β.

MI-Noisy Leakage. The success of an SCA key recovery adversary depends on how informative the leakage is about the underlying secret data processed. To measure this, we assume that the evaluator may determine how noisy any leakage function is. To this end, we formally define hereafter the concept of MI-noisy leakage.

Definition 3 (Noisy leakage for unary gates). Let C : Y → Y be an elementary calculation associated with the leakage function L. L is said to be δ-MI-noisy, for some δ ≥ 0, if for any input random variable A of C, uniformly distributed over Y,

MI(A; L(A)) ≤ δ.

Definition 4 (Noisy leakage for binary gates). Let C : Y^2 → Y be an elementary calculation associated with the leakage function L. L is said to be δ-MI-noisy, for some δ ≥ 0, if for any input random variables A, B of C, uniformly distributed over Y,

MI(A, B; L(A, B)) ≤ δ.

We chose the MI as the metric of reference in our proof, because it is at the core of Prouff & Rivain’s security bound that we revisit in this paper, and also because we can therefore rely on the recent improvements of Ito et al. [33], Masure et al. [40] and Béguinot et al. [14]. Moreover, the MI is known to be tightly linked to the complexity of Differential Power Analysis (DPA) attacks [37, 38, 39, 22, 17], and “generally carries more intuition (see, e.g., [5] in the context of linear cryptanalysis)” [24]. We discuss this choice of metric in section 5.

2.2 Rivain-Prouff ’s Masking Scheme

We recall hereafter the definition of masking, mostly taken from Prouff and Rivain’s paper [45, Def. 2].

Definition 5. Let d be a positive integer. The d-encoding of Y ∈ Y is a (d + 1)-tuple (Yi)0≤i≤d satisfying ⊕_{i=0}^{d} Yi = Y, and such that for any strict subset I of ⟦0, d⟧, (Yi)_{i∈I} is uniformly distributed over Y^|I|.

The parameter d in Definition 5 refers to the security parameter of the counter-measure. In their paper, Prouff and Rivain explain how to turn any block cipher into a d-th order secure implementation, i.e. such that any intermediate computation depending on a secret has a (d + 1)-encoding [45]. First, the plaintext and the secret key are split into d + 1 shares. Then, each elementary calculation of the block cipher is transformed as follows. If the elementary calculation is linear with respect to its inputs, then it is replaced by the sequence of elementary calculations listed in Algorithm 1. If the elementary calculation

Algorithm 1 Linear gadget in Prouff & Rivain’s proof.

Require: A: a (d + 1)-sharing of A; C: an elementary calculation linear in its input.
Ensure: B: a (d + 1)-sharing of C(A).
1: for i = 0, ..., d do
2:   Bi ← C(Ai)        ▷ Type 1 or 2
3: end for
4: B ← Refresh(B)      ▷ Assumed to be leak-free
5: A ← Refresh(A)      ▷ Only if A is used subsequently
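As an illustration of the sharewise processing in Algorithm 1, the following Python sketch applies a XOR-linear map to each share independently (the leak-free Refresh steps are omitted; the names are illustrative, not from the paper):

```python
from functools import reduce
from operator import xor

def linear_gadget(shares: list[int], g) -> list[int]:
    """Apply a map g that is linear w.r.t. XOR, share by share (Type 1 or 2).

    If g(a ^ b) == g(a) ^ g(b), then (g(A_0), ..., g(A_d)) is a valid
    (d + 1)-sharing of g(A). The leak-free Refresh steps are omitted here.
    """
    return [g(a) for a in shares]

# A bit-rotation is XOR-linear on bytes (squaring in GF(2^n), i.e. the
# Frobenius map, would be the cryptographically relevant example).
rotl8 = lambda x: ((x << 1) | (x >> 7)) & 0xFF

shares = [0x0F, 0xA5, 0x3C]            # a 3-sharing of 0x0F ^ 0xA5 ^ 0x3C
out = linear_gadget(shares, rotl8)
# By linearity, the outputs form a sharing of g applied to the secret:
assert reduce(xor, out) == rotl8(reduce(xor, shares))
```

Each share is processed in isolation, which is exactly why a single noisy observation of one intermediate value reveals at most one share.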

is an Sbox, then it can first be decomposed as a sequence of linear calculations and field multiplications. Then the linear calculations can be processed as in Algorithm 1, and the field multiplications can be replaced by the procedure listed in Algorithm 2. The latter is a variant of the actual I.S.W. scheme revisited by Rivain and Prouff at CHES 2010, up to a permutation between independent operations, so it does not change the amount of informative leakage. Overall, Rivain-Prouff’s masked implementation can be decomposed as subsequences of any of the following types:

  1. (zi ← g(xi))0≤i≤d, with g being a linear function (of the block cipher);
  2. (zi ← g(xi))0≤i≤d, with g being an affine function (within an Sbox evaluation);
  3. (vi,j ← ai × bj)0≤i,j≤d (cross-products computation step in a multiplication);
  4. (ti,j ← ti,j−1 ⊕ vi,j)0≤i,j≤d (compression step of a multiplication).
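For concreteness, a textbook I.S.W. multiplication over GF(2^8) can be sketched as follows; the type 3 subsequences appear as the cross products and the type 4 ones as the running XOR compression. This is the standard I.S.W. gadget, not the exact variant with leak-free refreshings analyzed in the paper.

```python
import secrets

AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the AES reduction polynomial

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in GF(2^8) modulo the AES polynomial."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
    return r

def isw_mul(a: list[int], b: list[int]) -> list[int]:
    """Textbook I.S.W. multiplication of two (d + 1)-sharings.

    The products a_i * b_j are the 'type 3' cross products; the running
    XOR accumulation below plays the role of the 'type 4' compression.
    """
    n = len(a)
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r[i][j] = secrets.randbelow(256)
            r[j][i] = r[i][j] ^ gf_mul(a[i], b[j]) ^ gf_mul(a[j], b[i])
    c = []
    for i in range(n):
        t = gf_mul(a[i], b[i])      # diagonal cross product
        for j in range(n):
            if j != i:
                t ^= r[i][j]        # compression: t_{i,j} = t_{i,j-1} ^ v_{i,j}
        c.append(t)
    return c

# Correctness: the XOR of the output shares equals the field product.
a = [0x13, 0x57, 0x13 ^ 0x57 ^ 0x53]   # a 3-sharing of 0x53
b = [0x01, 0x02, 0x01 ^ 0x02 ^ 0xCA]   # a 3-sharing of 0xCA
c = isw_mul(a, b)
assert c[0] ^ c[1] ^ c[2] == gf_mul(0x53, 0xCA)
```

The random values r[i][j] play the role of the refreshing randomness; without them, the compression step would leak jointly on several shares.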

For concreteness, we list two examples of such schemes for the AES Sbox (at least its non-linear part) in Algorithms 3 and 4. Algorithm 3 is the one initially proposed by Rivain and Prouff at CHES 2010. Recently, Cardoso et al. proposed at CARDIS 2022 an alternative exponentiation scheme depicted

authors make an intermediate reduction to the case where every elementary computation processes uniform secrets — and mutually independent as well, in the case of binary gates. Finally, the authors apply some noise amplification lemma from the literature. Our revisited proof applies the same outline. We now dig into the details of these steps.

3.1 Step 1: Decomposition into Subsequences

We first recall that the MI of a sequence of mutually independent leakages can be bounded by the sum of the MIs of each leakage.

Theorem 1 (Subsequence decomposition [45]). Let Y be a random variable over a finite set Y, not necessarily uniform. Let L = (L1, ..., Lt) be t random variables such that the random variables (Li | Y = y)i are mutually independent for every y ∈ Y. Then, we have

MI(Y; L) ≤ ∑_{i=1}^{t} MI(Y; Li). (2)

Although we do not claim any improvement in this first step, we reproduce the proof in section B for completeness.
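Theorem 1 can be checked numerically on a toy example: a uniform secret bit observed through two binary symmetric channels that are independent given the secret. The helper below computes the exact MI from the joint distribution (illustrative code, not from the paper).

```python
from itertools import product
from math import log2

def mi(p_y, channels):
    """Exact MI(Y; L_1, ..., L_t) when the L_i are independent given Y.

    p_y: dict y -> Pr[Y = y]; each channel: dict y -> dict l -> Pr[L = l | Y = y].
    """
    joint, p_l = {}, {}
    for y, py in p_y.items():
        for ls in product(*(ch[y] for ch in channels)):
            p = py
            for ch, l in zip(channels, ls):
                p *= ch[y][l]
            joint[y, ls] = joint.get((y, ls), 0.0) + p
            p_l[ls] = p_l.get(ls, 0.0) + p
    return sum(p * log2(p / (p_y[y] * p_l[ls]))
               for (y, ls), p in joint.items() if p > 0)

def bsc(eps):
    """Binary symmetric channel flipping the secret bit with probability eps."""
    return {0: {0: 1 - eps, 1: eps}, 1: {0: eps, 1: 1 - eps}}

p_y = {0: 0.5, 1: 0.5}
l1, l2 = bsc(0.3), bsc(0.4)

# Subadditivity (Theorem 1): joint MI bounded by the sum of marginal MIs.
assert mi(p_y, [l1, l2]) <= mi(p_y, [l1]) + mi(p_y, [l2]) + 1e-12
```

For a single BSC with uniform input, the computed MI matches the closed form 1 − H2(eps), which is a convenient sanity check on the helper.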

3.2 Step 2(a): Reduction to Uniform Secrets for Unary Gates

We now revisit the second step of Prouff and Rivain’s work, namely the reduction from non-uniform secrets to uniform secrets. To this end, we split our results into two cases. The first case, treated in this subsection, deals with non-uniform inputs of unary calculations, such as Line 4 in Algorithm 3. The second case deals with non-uniform and non-independent inputs of binary calculations, such as Line 6 in Algorithm 3, and will be deferred to subsection 3.3. The results presented in this section aim at bounding the MI between C(Y), where C : Y → Y, and its corresponding leakage. We first state the following theorem, which relies on a technical lemma from Shulman and Feder [49].

Theorem 2 (Generic Bound for Non-Uniform Secrets [49, p. 1360]). Let L : Y → L be a random function denoting a leakage, and let Y be uniformly distributed over Y. Then, there exists a constant α such that for all random variables G arbitrarily distributed over Y, the following inequality holds true:

MI(G; L(G)) ≤ α · |Y| · MI(Y; L(Y)). (3)

Moreover, the smallest value α such that Equation 3 holds true belongs to the interval α ∈ [log2(e)/e, 1 − e^−1] ≈ [0.53, 0.63].

Theorem 2 introduces an overhead scaling with |Y|, which could decrease the final security level by one or several orders of magnitude (e.g., for the AES, |Y| = 2^8). Note that Equation 3 is nearly tight in the general case, in the sense that the range of α is narrow. Shulman and Feder exhibit an example of worst-case leakage function such that Equation 3 becomes an equality, for α ≈ 0.53 [49].

The Power Map Trick. However, such worst-case C functions are not likely to be used in cryptographic primitives, since, e.g., the input and output of an Sbox are expected to be uniformly distributed, for cryptographic reasons. That is why we refine hereafter the generic statement of Theorem 2, and we present some examples where this refinement can remove the dependency on the field size. To this end, we revisit Theorem 2 by relying on an intermediate result of Shulman and Feder’s proof.

Lemma 1 ([49, Lemma 6]). Let L be a leakage function and Y, Y′ two random variables distributed (not necessarily uniformly) over the finite set Y, such that the support of Pr(Y′) contains the support of Pr(Y). Then, the following inequality holds:

MI(Y; L(Y)) / MI(Y′; L(Y′)) ≥ min_{y∈Y} Pr(Y = y) / Pr(Y′ = y).

As a result, we straightforwardly get the following corollary.

Corollary 1. In the same setting as in Lemma 1, if now the support of Pr(Y) contains the support of Pr(Y′), the following inequality holds true:

MI(Y′; L(Y′)) / MI(Y; L(Y)) ≤ max_{y∈Y} Pr(Y′ = y) / Pr(Y = y).

Proof. Straightforward, using Lemma 1 and the identity max_{x∈X} x = 1 / min_{x∈X} (1/x) for some finite ordered set X.

We will leverage Corollary 1 in the case where the Sbox is a monomial, i.e. is of the shape y ↦ y^k. Admittedly, this makes our proof slightly more specific than Prouff and Rivain’s, as the latter can handle any Sbox expressed as a polynomial. Nevertheless, this assumption remains mild, as it covers many Sboxes used in practical ciphers, including the AES, and will allow us to remove a constant factor equal to the field size. We have seen in Algorithms 3 and 4 that the monomial y ↦ y^k can be computed in the Rivain-Prouff masking scheme by computing intermediate power maps y ↦ y^k′ for some k′ ≤ k, through some square-and-multiply schemes [47]. The bound on the leakage induced by such an intermediate computation is handled by the following corollary.

Corollary 2. Let Y be a uniform random variable over a finite field Y of size M ≥ 2. For any k ∈ ⟦1, M − 1⟧, define the function C : y ∈ Y ↦ y^k. Let L : Y → L be a δ-MI-noisy leakage. Then:

MI(Y; L(Y^k)) ≤ M/(M − 1) · gcd(k, M − 1) · δ. (5)

Proof. Using the Data Processing Inequality (DPI) (stated in Lemma 2 in Appendix A), we are reduced to upper-bounding MI(Y^k; L(Y^k)). To this end, we shall compute the p.m.f. of Y^k. The result will then follow from Lemma 1 and

Theorem 3 ([12, Thm. 3.2]). Let M > 2 be an integer. Then, for all ε > 0, we have E_k[gcd(k, M)] = O(M^ε), where the expectation is taken with respect to k uniformly distributed in ⟦1, M⟧.

The practical interpretation of Theorem 3 is that if a given exponentiation scheme gives high constant factors, then it should not be hard to modify it in order to make the constant factor in the right-hand side of Equation 5 arbitrarily low. As a consequence, we may treat the right-hand side of Equation 5 as asymptotically independent of M with high probability. That is why, in the remainder of this paper, we will abuse notation by denoting any gcd factor as scaling as O(M^ε), which is confirmed on our implementations of interest by Table 2.
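Both the gcd(k, M − 1) factor of Corollary 2 and the averaging effect in the spirit of Theorem 3 can be observed numerically. The sketch below works over a prime field GF(251) for simplicity (only the cyclicity of the multiplicative group matters here; the paper’s fields have characteristic two), and it averages the gcd over the group order M − 1, the quantity that actually appears in Corollary 2.

```python
from math import gcd
from statistics import mean

M = 251  # prime, so the multiplicative group GF(M)* is cyclic of order M - 1

def preimage_counts(k: int) -> set[int]:
    """Distinct preimage counts of the power map y -> y^k on GF(M)*."""
    counts = {}
    for y in range(1, M):
        z = pow(y, k, M)
        counts[z] = counts.get(z, 0) + 1
    return set(counts.values())

# Each value in the image of y -> y^k has exactly gcd(k, M - 1) preimages,
# so Pr[Y^k = z] = gcd(k, M - 1)/(M - 1) on the image: the deviation of
# Y^k from uniform is governed by the gcd factor of Corollary 2.
for k in (2, 3, 10, 254):
    assert preimage_counts(k) == {gcd(k, M - 1)}

# Averaging effect: the gcd factor is small for most exponents.
assert mean(gcd(k, M - 1) for k in range(1, M)) < 6
```

For most k the gcd is 1 or 2, so a bad constant factor can usually be avoided by re-routing the exponentiation chain through different intermediate exponents.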

3.3 Step 2(b): Reduction to Uniform Secrets for Binary Gates

We have shown in subsection 3.2 how to significantly decrease the loss in the reduction from non-uniform secrets to uniform secrets for leakage coming from unary gates dealing with power maps. In order to have a complete toolbox for reductions to uniform secrets, we also need to deal with leakages coming from gadgets with two input operands, e.g., I.S.W. multiplications. Hereupon, Theorem 2 straightforwardly applies, although incurring a loss of 0.63 · |Y|^2 in the reduction. That is why we may naturally think of extending the power map trick introduced before. But contrary to Theorem 2, Corollary 2 does not extend as straightforwardly to binary gates. Indeed, calculations with more than one operand add another difficulty: not only may the operands not be uniformly distributed, but they might also be non-independent. This results in the following corollary.

Corollary 3. Let Y be a random variable uniformly distributed over the finite field Y. For p, q ∈ ⟦2, M − 2⟧, let Z = (Y^p, Y^q). Let L : Y^2 → L be a δ-MI-noisy leakage. Then,

MI(Y; L(Z)) ≤ M/(M − 1) · min{gcd(p, M − 1), gcd(q, M − 1)} · M · δ. (7)

Proof. We apply Lemma 1 to the random vector Z′ = (Y, Y′), where Y′ is an independent copy of Y. For any x, y ∈ Y, the total probability formula implies that

Pr(Y^p = x, Y^q = y) / Pr(Y = x, Y′ = y) ≤ (∑_{y′} Pr(Y^p = x, Y^q = y′)) / Pr(Y = x, Y′ = y) = Pr(Y^p = x) / (Pr(Y = x) · Pr(Y′ = y)).

Using Equation 6, we get that

Pr(Y^p = x, Y^q = y) / (Pr(Y = x) · Pr(Y′ = y)) ≤ M/(M − 1) · gcd(p, M − 1) · M. (8)

By symmetry, we can obtain the same bound by permuting the roles of p and q, which gives Equation 7.

Remark 1. Note that the inequality in Equation 8 is tight, e.g., if p divides q, or conversely. Likewise, we argued that Equation 3 is generally tight, unless further assumptions are made on the prior distribution. Nevertheless, both facts do not necessarily imply that Equation 7 is tight. Whether the latter inequality could be refined for binary gates with non-independent operands remains an open question that we will briefly discuss in subsection 3.4.

3.4 Step 3: The Amplification Theorems

We now revisit the third step of Prouff & Rivain’s approach. To this end, as in subsection 3.2 and subsection 3.3, we distinguish between the unary gates and the binary gates.

For Unary Gates. The following amplification theorem is at the core of our direct proof in the noisy leakage model, and goes by the name of Mrs. Gerber’s Lemma (MGL). It was initially stated by Wyner and Ziv [53] for binary random variables, and has recently been extended by Jog and Anantharam to random variables in Abelian groups whose size is a power of two [34]. This result was recently pointed out to the SCA community by Béguinot et al. at COSADE 2023 [14].

Theorem 4 (Mrs. Gerber’s Lemma (MGL) [14, Cor. 1]). Let |Y| = 2^n for some bit-size n and let d be a positive integer. Let Y0, ..., Yd be a (d + 1)-encoding of the uniform random variable Y over Y, and let L = (L0, ..., Ld) be such that, conditionally to Yi, the variable Li is independent of the others. Assume that for all i ∈ ⟦0, d⟧, MI(Yi; Li) ≤ δi for some parameter 0 ≤ δi ≤ 1. Then

MI(Y; L) ≤ fMI(δ0, ..., δd), (9)

where fMI(·) is Mrs. Gerber’s function.

We refer to the works of Béguinot et al. for more details about Mrs. Gerber’s function [14]. In our context, we only need the properties summarized hereafter.

Proposition 1 (The MGL function [14, Thm. 1, Prop. 3]). The Mrs. Gerber’s Lemma (MGL) function fMI(·) is concave with respect to any of its variables, when the remaining ones are kept fixed. Let η = (2 log 2)^−1 ≈ 0.72. Then for all δ0, ..., δd ∈ [0, 1], we have

fMI(δ0, ..., δd) ≤ η · ∏_{i=0}^{d} (δi/η).
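The amplification promised by Proposition 1 is easy to evaluate numerically; the sketch below tabulates the upper bound η · ∏ (δi/η) for a fixed, illustrative per-share noise level (the values are made up for the example).

```python
from math import log

eta = 1 / (2 * log(2))  # (2 log 2)^-1 ≈ 0.72

def mgl_upper_bound(deltas: list[float]) -> float:
    """Right-hand side of Proposition 1: eta * prod_i (delta_i / eta)."""
    prod = 1.0
    for d_i in deltas:
        prod *= d_i / eta
    return eta * prod

# With one share (d = 0) the bound is just delta itself; each extra share
# multiplies the bound by delta/eta < 1, hence exponential amplification.
bounds = [mgl_upper_bound([0.01] * (d + 1)) for d in range(5)]
assert abs(bounds[0] - 0.01) < 1e-12
assert all(b2 < b1 for b1, b2 in zip(bounds, bounds[1:]))
```

Note that the amplification only kicks in when δ < η, which is exactly the kind of noise requirement the paper aims to make tight.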

For Binary Gates. We now extend Béguinot et al.’s Theorem 4 to the case of binary gates, as stated hereafter by the following theorem that we prove in Appendix B.1, following a similar outline as Prest et al. [44, Thm. 6].

{Li,j}0≤i,j≤d by L, and denote ϕ(p, q, M) = min(gcd(p, M − 1), gcd(q, M − 1)). Then we have:

MI(Y; L) ≤ 2 · |Y| · |Y|/(|Y| − 1) · ϕ(p, q, |Y|) · η · ((d + 1) · δ/η)^(d+1). (14)

Proof. Combining Theorem 5 with Corollary 3.

It now remains to give some upper bounds for type 4 subsequences. These subsequences can be observed in the compression phase of I.S.W. multiplications (after cross-products and refreshings). This is the aim of the following result, which we prove in Appendix B.1.

Theorem 6. Let Y0, ..., Yd be d + 1 independent uniformly random variables over a finite set Y. Let L1, ..., Ld be a family of δi-MI leakage functions, defined over Y × Y, for some 0 ≤ δi ≤ 1. We have:

MI(Yd; L1(Y0, Y1), ..., Ld(Yd−1, Yd)) ≤ δd. (15)

Corollary 7 (Type 4 subsequences). Let Y be a secret, such that for p, q ∈ N the product of the multiplication Y^p × Y^q is processed by an I.S.W. gadget. For 0 ≤ i, j ≤ d and for Ti,j, Vi,j ∈ Y, let L = {Li,j(Ti,j−1, Vi,j)}0≤i,j≤d denote the corresponding type 4 leakages such that for all i, j, the leakage Li,j(Ti,j−1, Vi,j) is δi,j-MI-noisy, for δi,j ≤ δ ≤ 1. Then the following inequality holds true:

MI(Y; {Li,j(Ti,j−1, Vi,j)}0≤i,j≤d) ≤ |Y|/(|Y| − 1) · gcd(p + q, M − 1) · η · (δ/η)^(d+1). (16)

Proof. Using Corollary 2, we reduce to the case where Y^p × Y^q is uniformly distributed over Y, inducing a gcd(p + q, M − 1) factor overhead. Then, by gathering the leakages Li,j sharing the same index i into batches, we may notice that each batch of index i only depends on one share of Y. We may therefore invoke Theorem 4 as follows:

MI(Y; L) ≤ fMI(δ′0, ..., δ′d), (17)

where δ′i = MI(Yi; {Li,j(Ti,j−1, Vi,j)}0≤j≤d). Finally, we can upper bound each δ′i by δi,d using Theorem 6.

3.6 From Subsequences to a Complete Computation.

We can now combine the three previous steps to state the main result, in a similar way as Prouff and Rivain [45, Thm. 4] and as Prest et al. [44, Sec. 6.3].

Theorem 7. Consider a Y-block cipher with monomial Sboxes, where a sequence of elementary calculations depends on a uniformly distributed random variable Y. Assume that these elementary calculations are protected by a d-encoding masking scheme as described in subsection 2.2, resulting in T elementary calculations giving access to the leakage L = (L_i)_{1≤i≤T}, where each leakage function L_i is assumed to be δ-MI-noisy. Then, the following inequality is verified:

MI(Y; L) ≤ t_3 · η · ((d + 1)δ/η)^{d+1} + t_{1,2,4} · η · (δ/η)^{d+1},

such that

t_3 = Σ_{(p,q)∈M} ϕ(p, q, |Y|),    t_{1,2,4} = Σ_{(p,q)∈M} φ(p, q, |Y|) + Σ_{k∈S} ψ(k, |Y|),    (18)

where M is the sequence of pairs (p, q) of exponents in the operands of the I.S.W. multiplication gadgets, S is the sequence of exponents (k) of operands over which a linear transformation is applied, and

  • ϕ(p, q, M) = 2 · M · M/(M − 1) · min(gcd(p, M − 1), gcd(q, M − 1)),
  • φ(p, q, M) = M/(M − 1) · gcd(p + q, M − 1),
  • ψ(k, M) = gcd(k, M − 1).

Proof. We apply Theorem 1 to decompose the MI into a sum of MIs for each subsequence. Since by assumption Y is uniformly distributed over Y, Corollaries 4, 5, 6, 7 directly apply to bound each term in the sum.

Note that in (18), t_3 = O(|Y|^{1+ε} · |M|), and t_{1,2,4} = O(|Y|^ε · (|M| + |S|)).
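The constants in (18) are plain gcd-based sums, so they are cheap to evaluate. The sketch below computes t_3 and t_{1,2,4} for |Y| = 256 with made-up exponent sequences M and S (hypothetical, not taken from an actual cipher); it also illustrates that t_3 dominates t_{1,2,4} by roughly a factor of the field size.

```python
from math import gcd

def phi(p, q, M):
    # ϕ(p, q, M) = 2·M·M/(M−1)·min(gcd(p, M−1), gcd(q, M−1))
    return 2 * M * M / (M - 1) * min(gcd(p, M - 1), gcd(q, M - 1))

def phi_bar(p, q, M):
    # φ(p, q, M) = M/(M−1)·gcd(p+q, M−1)
    return M / (M - 1) * gcd(p + q, M - 1)

def psi(k, M):
    # ψ(k, M) = gcd(k, M−1)
    return gcd(k, M - 1)

M_size = 256
mults = [(1, 2), (3, 12)]   # hypothetical exponent pairs fed to I.S.W. multiplications
lins = [2, 4, 8]            # hypothetical exponents under linear transformations

t3 = sum(phi(p, q, M_size) for p, q in mults)
t124 = sum(phi_bar(p, q, M_size) for p, q in mults) + sum(psi(k, M_size) for k in lins)
print(t3, t124)
```

With these (arbitrary) sequences, t_3 is three orders of magnitude larger than t_{1,2,4}, consistent with the O(|Y|^{1+ε}) versus O(|Y|^ε) scaling noted above.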

Corollary 8. For any random-plaintext SCA key recovery adversary targeting a Y-block cipher protected by the masking scheme described in subsection 2.2, the efficiency verifies the following bound:

N_a^*(SR) ≥ f(SR, |Y|) / ((t_3 + t_{1,2,4}) · η) · (η / ((d + 1)δ))^{d+1},

where f(SR, M) = log_2(M) − (1 − SR) log_2(M − 1) − H_2(SR), where H_2 is the binary entropy function, and where the constants t_3 and t_{1,2,4} are the ones defined in Theorem 7.

Proof. Chérisey et al.'s security bound links the SCA key recovery efficiency to the MI between Y = K ⊕ P and the corresponding leakage:

N_a^*(SR) ≥ f(SR, |Y|) / MI(Y; L).

Plugging Theorem 7 into the latter inequality gives the result.

In other words, any random-plaintext attack on the masked implementation will require at least Ω(|Y|^{−(1+ε)} · log|Y| · (η/((d + 1)δ))^{d+1}) queries to the target device.
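The function f(SR, M) admits two simple sanity checks: f(1, M) = log_2(M), since a full-success adversary must extract the whole key entropy, and f(1/M, M) = 0, since blind guessing requires no leakage at all. A minimal sketch, assuming nothing beyond the formula in Corollary 8:

```python
import math

def H2(x):
    # Binary entropy, with the usual convention H2(0) = H2(1) = 0.
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def f(SR, M):
    # f(SR, M) = log2(M) - (1 - SR)·log2(M - 1) - H2(SR)
    return math.log2(M) - (1 - SR) * math.log2(M - 1) - H2(SR)

M = 256
print(f(1.0, M))    # full success: log2(256) = 8 bits
print(f(1 / M, M))  # blind guessing: 0 bits (up to float rounding)
```

Between these two extremes, f grows with SR, so a higher targeted success rate tightens the query lower bound, as expected.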

leveraging the identity G × H = ((G ⊕ R) × H) ⊕ (R × H). At first glance, one may think that this trick merely shifts the problem without fixing it: the input operands of both I.S.W. multiplications are now independent, but the input operands of the final Xor in line 8 of Algorithm 5 are no longer independent. Surprisingly, the prior joint distribution of the outputs (M, H′) of the two I.S.W. multiplications has a much lower bias with respect to the joint uniform distribution than the bias of the joint distribution of (G, H). This is formalized in the following theorem, proven in Appendix B.2.

Theorem 8. Let Y ∈ Y be uniformly distributed, and let L correspond to the leakage of Algorithm 5. Then, assuming leak-free refreshings, and that g(Y) = Y^p and h(Y) = Y^q for positive integers p, q, the following inequality is satisfied:

MI(Y; L) ≤ ϕ(p, q, |Y|) · η · ((d + 1)δ/η)^{d+1} + φ(p, q, |Y|) · η · (δ/η)^{d+1},

where

  • ϕ(p, q, M) = 4 · M/(M − 1) · gcd(q, M − 1),
  • φ(p, q, M) = 4 + M/(M − 1) · gcd(p, M − 1) + max(2, M/(M − 1) · gcd(p + q, M − 1)).

Note that in Theorem 8, both ϕ and φ are almost independent of the field size, whereas t 3 in Theorem 7 scales at least linearly with the field size. From Theorem 8 follows the corollary stated hereafter.
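To make this gain concrete, one can numerically compare the ϕ constant of Theorem 7 with that of Theorem 8 (the exponents p, q below are chosen arbitrarily for illustration): the ratio grows linearly with the field size M.

```python
from math import gcd

def phi_isw(p, q, M):
    # ϕ from Theorem 7: 2·M·M/(M−1)·min(gcd(p, M−1), gcd(q, M−1))
    return 2 * M * M / (M - 1) * min(gcd(p, M - 1), gcd(q, M - 1))

def phi_blinded(p, q, M):
    # ϕ from Theorem 8: 4·M/(M−1)·gcd(q, M−1)
    return 4 * M / (M - 1) * gcd(q, M - 1)

p, q = 1, 2
for M in (16, 256, 65536):
    print(M, phi_isw(p, q, M) / phi_blinded(p, q, M))
```

For these exponents the ratio is exactly M/2, so the blinded gadget removes the linear field-size factor from the dominant constant.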

Corollary 9. In the same setting as Corollary 8, if the I.S.W. multiplication gadgets are replaced by the scheme in Algorithm 5, then

N_a^*(SR) ≥ Ω(|Y|^{−ε} · log|Y| · (η/((d + 1)δ))^{d+1}).

Proof. The proof follows that of Corollary 8, by updating the functions ϕ and φ in Equation 18 with the new values from Theorem 8.

5 Discussion

We have established our main results in section 3 and section 4. We now discuss some features of these results and compare them to previous works. To this end, we first compare our bounds to previous works in subsection 5.1. We then discuss in subsection 5.2 how our results can be extended to security bounds in terms of chosen-plaintext attacks. We conclude this section by discussing the advantages and drawbacks of the blinded I.S.W. gadget presented in section 4.

5.1 Comparison with Related Works

We compare in this section our security bounds with related works. To this end, we first discuss the noise requirements of the different security bounds in the literature, which we synthesize in Table 3. We can see that our security bound has a similar noise requirement to the proofs of Dziembowski et al. [25] and Prest et al. [44], although stated in different metrics. Notice that the dependency of our noise requirement on d is tight, since it reflects the potential ability of an adversary to increase its probability of recovering each share through horizontal attacks, as argued by Battistello et al. [7] and Grosso and Standaert [31]. Nevertheless, it is still possible to relax this dependency by using other multiplication gadgets [1,3,2,8,28,29]. Moreover, we also extend Prest et al.'s case study on the exemplary leakage distribution in which each intermediate calculation is assumed to leak its Hamming weight with an additive Gaussian noise of standard deviation σ [44, Table 1]. We complete Table 3 with our new result, using the fact that for such a leakage model, MI = Θ(log(M)/σ²). It can be noticed that on this particular leakage distribution, our requirement on the minimal noise level is now the weakest of all security proofs based on the I.S.W. masking scheme.
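The last row of Table 3 can be recovered numerically: modelling MI ≈ c · log_2(M)/σ² and imposing the requirement MI ≤ 1/d yields σ ≥ √(c · d · log_2(M)), i.e. σ = Ω(√(d log M)). A minimal sketch, with an illustrative constant c = 1 (the true constant depends on the leakage model):

```python
import math

def min_sigma(d, M, c=1.0):
    # Solve c * log2(M) / sigma^2 <= 1/d for sigma:
    #   sigma >= sqrt(c * d * log2(M))
    return math.sqrt(c * d * math.log2(M))

# The required noise level only grows like sqrt(d) for a fixed field size.
for d in (2, 8, 32):
    print(d, min_sigma(d, 256))
```

Quadrupling the masking order only doubles the required noise standard deviation, in contrast with the linear-in-d requirements of the earlier rows of Table 3.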

Table 3: Noise requirements, and illustration on a case study on a Hamming weight leakage model with additive Gaussian noise.

| Work (year)  | Noise requirement | Equivalent Gaussian noise   |
|--------------|-------------------|-----------------------------|
| [45] (2013)  | EN ≤ O(1/(dM³))   | σ ≥ Ω(d·M^{5/2}·√log(M))    |
| [23] (2014)  | SD ≤ O(1/(dM²))   | σ ≥ Ω(d·M²·√log(M))         |
| [25] (2015)  | SD ≤ O(1/d)       | σ ≥ Ω(d·√log(M))            |
| [44] (2019)  | RE ≤ O(1/d)       | σ ≥ Ω(d·log(M)) ⁷           |
| This work    | MI ≤ O(1/d)       | σ ≥ Ω(√(d·log(M)))          |

At first glance, Table 3 suggests that Prest et al.'s RE-based security bound remains quite competitive with the other works in terms of noise requirements. However, we emphasize that the RE is a worst-case metric, whereas all the other metrics in Table 3 are averaged metrics. Estimating worst-case metrics may not always be efficiently tractable for practitioners, especially for high-dimensional leakage. In addition, worst-case metrics are by definition much more conservative than averaged metrics, which contrasts with concrete SCA security metrics like the GE or the SR [50], which are also averaged metrics. To illustrate this, let us

⁷ As explained by Prest et al. [44, Remark 2], the RE is not even formally defined for leakage models with Gaussian noise, unless resorting to a tail-cut argument that adds another constant factor hidden in the Ω(·) notation.