Identification Capacity - The Primitive BIS

2.3 The Primitive BIS

2.3.2 Identification Capacity

Willems et al. [76] applied information theoretic methods to investigate the maximum achievable rate of the individuals with vanishing decoding error probability. We first note that RVsX_iⁿ,Y_iⁿ, andZⁿ form a Markov chainY_iⁿ−X_iⁿ−Zⁿ[14], and thus the joint probability distribution among them can be written as

PX_iⁿY_iⁿZⁿ(xⁿ_i,yⁿ_i,zⁿ) =PX_iⁿ(xⁿ_i)P_Yⁿ

i|X_iⁿ(yⁿ_i|xⁿ_i)P_Zⁿ_|Xⁿ

i (zⁿ|xⁿ_i)

∏

k=1

P_Y|X(yik|x_ik)PX(xik)PZ|X(zk|x_ik), (2.19) where the last equation in (2.19) is due to the i.i.d. structure of each symbol. Then, from the marginal logic, it can be easily derived that

PY_iⁿZⁿ(yⁿ_i,zⁿ) =PY_iⁿ(yⁿ_i)P_Zⁿ_|Yⁿ

i (zⁿ|yⁿ_i)

∏

k=1

P_Y(yik)PZ|Y(zk|y_ik), (2.20) whereP_Y andP_Z|Y can be computed as

P_Y(y) =

∑

x∈X

∑

z∈Z

P_X(x)P_Y|X(y|x)P_Z|X(z|x), (2.21) P_Z|Y(z|y) =

∑

x∈X

P_X(x)PY|X(y|x)P_Z|X(z|x)

PY(y) . (2.22)

Equation(2.20)implies thatP_Z|Y :Y → Zalso forms a DMC and each enrollment output sequence yⁿ_i can be viewed as an i.i.d source sequence.

16 Preliminaries Remark 2.1. Willems et al. [76] observed that the set of all the enrollment output sequences (a database) can be seen as a code whose codeword corresponding to message i is yⁿ_i. In the identification phase, these codewords yⁿ_i (1≤i≤MI)are observed via a DMC P_Z|Y :Y → Z.

Remark 2.1 indicates that we can use the same techniques for analyzing the error probability in channel coding systems to bound the error probability in the BIS. The identification rate of the BIS with the block lengthnandM_I individuals is denoted by

R_I= 1

nlogM_I. (2.23)

Definition 2.4. An identification rate R_I (RI ≥0)is said to be achievable if forδ >0and large enough n there exists a decoder d such that

maxi∈I Pr{Wb ̸=W|W =i} ≤δ, (2.24) 1

nlogMI≥RI−δ. (2.25)

Moreover, the identification capacity C of the BIS is the supremum of all achievable identification rates RI. That is,

C=sup{R_I | R_I is achievable}.

Now, we are in a position to introduce the identification capacity. In [76], Willems et al. proved the following theorem.

Theorem 2.1. (Willems et al. [76])

The identification capacity of the BIS is given by

C=I(Y;Z), (2.26)

where I(Y;Z) denotes the mutual information between RVs Y and Z with the joint probability distribution P_ZY(z,y) =

∑

x∈X

P(x)P_Y|X(y|x)P_Z|X(z|x) (∀y∈ Y,∀z∈ Z).

The above theorem is a convincing result because the set of {Y₁ⁿ,· · ·,Y_Mⁿ

I}can be viewed as a random code. When a codeword of this random code is sent via the channelP_Z|Y, which is a compound channel consisting of the backward channel of the channel in the enrollment phase and the identification channel, the maximum achievable rate that the index of the sent message can be reliably estimated is the mutual information between RVsY andZ. This is a simple result, but it has been a huge influence for all the relevant studies taken place later on.

Chapter 3

BISs Supporting Authentication

As mentioned in Chapter 1, the BIS supporting authentication was first investigated by Ignatenko and Willems for the generated-secret model [32] and for both the generated- and chosen-secret BIS models [33]. They aimed to maximize the identification and secrecy rates under a privacy-leakage constraint for the VSM. However, in real-life applications, when user enrolls her bio-data for future identification, the identity needs to be scanned and sent into the system database. During these processes, it is likely that noise is added to the original bio-data. Therefore, it is more natural to analyze the BIS under the circumstance that the enrollment channel is noisy. Another interesting thing is that when the model switches from the VSM to the HSM, the problem becomes more challenging.

Especially, the evaluation of the privacy-leakage rate as the template is not a function of original bio-data, but a noisy version of it. Basically, many techniques used to investigate the fundamental trade-off in the VSM are not directly applicable to the HSM as claimed in [21]. Indeed, we are greatly motivated by these facts to improve the models considered in [33] to include the HSM.

In this chapter, we focus on clarifying the fundamental trade-off among identification, secrecy, template, and privacy-leakage rates of the generated- and chosen-secret BIS models. Both models are analyzed under the assumption of the HSM. Here, we wish to maximize the identification and secrecy rates under privacy and storage constraints. In order to get closer to practical system, we analyze the region by imposing the following requirements:

1) the enrollment channel is a noisy,

2) a scheme of both protecting privacy (e.g., [33], [21]) and compressing template (e.g., [69], [81]) is considered,

3) the capacity region is analyzed under the condition that the prior distribution of an identified individual is unknown.

In the achievability proof, we deploys layered binning technique to reduce the rate of database, and this enable us to make the error probability arbitrarily small. To handle the difficulties of bounding the privacy-leakage in the achievability proof, we introduce avirtualsystem with apartialdecoder, which outputs only the secret data of individual, and use Lemma 2.2. In the converse proof, we

18 BISs Supporting Authentication relax the problem to be the one where the prior distribution of the identified user is uniform, and use Fano’s inequality together with the assistance of auxiliary RVs, which is quite standard technique for analyzing the outer bound of the capacity region. We show that there are two different ways to express the capacity regions of the generated- and chosen-secret BIS models. An expression uses only a single auxiliary RV and another requires two auxiliary RVs. Later, we will demonstrate that the two regions (regions with one and two auxiliary RVs) are technically identical in Remark 3.5. Although there are two different aspects, we provide the proof of our main result based on the one employing two auxiliary RVs. Some benefits of deriving via two auxiliary RVs are that the achievability proof can be done in a simpler form since each rate constraint is addressed individually. The characterization of the capacity regions of the models is basically similar to the ones given in [21], [33], and [81]. As special cases, it can be checked that our characterization reduces to the one given by Ignatenko and Willems [33] where the enrollment channel is noiseless and there is no constraint on the template rate, and it also coincides with the result derived by Günlü and Kramer [21] where there is only one individual, and thus individual’s estimation is not necessary.

The rest of this chapter is organized as follows. In Section 3.1, we describe the details of the system model considered in this chapter. In Section 3.2, we present our main results. Next, we provide the detailed proofs of the main results in Section 3.4 and Section 3.5. Finally, in Section 3.6, we give summary of results and discussion.

3.1 System Model

In this section, we explain the system models considered in this chapter. For the detailed explanations of the processes of generating bio-data sequenceX_iⁿand observingY_iⁿandZⁿ, the readers should refer to Section 2.3. Here, we only provides the new parts. We start with describing the generated-secret BIS model and then the chosen-secret BIS model.

ドキュメント内電気通信大学学術機関リポジトリ (ページ 31-34)