
Background Estimation

5.2 Background Samples

5.2.8 Fake Lepton

We define leptons that originate from W and Z boson decays as real leptons. In contrast, leptons that originate from photon conversions, semileptonic B hadron decays, or misidentified jets are defined as fake leptons. The background process consists of fake leptons and multiple jets. We use a data-driven method known as the matrix method to determine the normalisation of the fake lepton events.

The matrix method for events with a single lepton is outlined as follows. We define two sets of lepton criteria, loose and tight, which differ in the isolation and identification requirements, and prepare the two corresponding data sets. The loose data set contains a larger fraction of fake leptons than the tight data set, whereas the tight data set contains a larger fraction of real leptons than the loose data set. Equations (5.4) and (5.5) give the number of events passing the loose selection, N^L, and the number of events passing the tight selection, N^T:

$$N^{L} = N_{\mathrm{fake}}^{L} + N_{\mathrm{real}}^{L}, \tag{5.4}$$

$$N^{T} = N_{\mathrm{fake}}^{T} + N_{\mathrm{real}}^{T}, \tag{5.5}$$

where N_fake^L is the number of fake lepton events after the loose selection, N_real^L is the number of real lepton events after the loose selection, N_fake^T is the number of fake lepton events after the tight selection, and N_real^T is the number of real lepton events after the tight selection. The events passing the tight selection are a subset of those passing the loose selection, so N^T < N^L. We then define ε_r as the ratio of N_real^T to N_real^L, and ε_f as the ratio of N_fake^T to N_fake^L:

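$$\varepsilon_r = \frac{N_{\mathrm{real}}^{T}}{N_{\mathrm{real}}^{L}}, \qquad \varepsilon_f = \frac{N_{\mathrm{fake}}^{T}}{N_{\mathrm{fake}}^{L}}. \tag{5.6}$$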

ε_r indicates the reduction rate of real leptons between the loose selection and the tight selection, that is, the fraction of loose real leptons that also pass the tight selection. ε_f is the corresponding quantity for fake leptons.

We measure ε_r and ε_f by collecting two data sets. The procedures for estimating the real efficiency ε_r and the fake efficiency ε_f differ between the electron and muon channels. The number of fake lepton events passing the tight selection, N_fake^T, is estimated by

$$N_{\mathrm{fake}}^{T} = \frac{\varepsilon_f}{\varepsilon_r - \varepsilon_f}\left(\varepsilon_r N^{L} - N^{T}\right). \tag{5.7}$$
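Equation (5.7) follows from Equations (5.4) to (5.6) in a few algebraic steps. The short derivation below is included as a reading aid and uses only the symbols defined above; it assumes that ε_r and ε_f differ, so that the denominator does not vanish.

```latex
\begin{align*}
N^{T} &= \varepsilon_f N_{\mathrm{fake}}^{L} + \varepsilon_r N_{\mathrm{real}}^{L}
  && \text{(Eqs. 5.5 and 5.6)} \\
\varepsilon_r N^{L} - N^{T}
  &= \varepsilon_r \left(N_{\mathrm{fake}}^{L} + N_{\mathrm{real}}^{L}\right)
     - \varepsilon_f N_{\mathrm{fake}}^{L} - \varepsilon_r N_{\mathrm{real}}^{L}
  && \text{(Eq. 5.4)} \\
  &= \left(\varepsilon_r - \varepsilon_f\right) N_{\mathrm{fake}}^{L} \\
N_{\mathrm{fake}}^{T} = \varepsilon_f N_{\mathrm{fake}}^{L}
  &= \frac{\varepsilon_f}{\varepsilon_r - \varepsilon_f}
     \left(\varepsilon_r N^{L} - N^{T}\right)
\end{align*}
```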

Both efficiencies, ε_r and ε_f, depend on variables associated with lepton kinematics, such as p^lepton_T, and on event characteristics, such as the number of b-tagged jets. The efficiencies are therefore parametrised as functions of various object kinematics. An event weight is then expressed by

$$w_i = \frac{\varepsilon_f}{\varepsilon_r - \varepsilon_f}\left(\varepsilon_r - \delta_i\right), \tag{5.8}$$

where δ_i equals unity if the loose event i passes the tight event selection and 0 otherwise. The sum of w_i over all events in a given bin of the final observable gives the estimated number of fake lepton events in that bin.
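The relation between the per-event weights of Equation (5.8) and the closed-form yield of Equation (5.7) can be checked with a toy example. The following is a minimal sketch, assuming constant efficiencies for all events; the function names and the numerical values are illustrative and are not taken from the analysis.

```python
def fake_weight(eps_r, eps_f, is_tight):
    """Per-event matrix-method weight of Eq. (5.8).

    is_tight plays the role of delta_i: True if the loose event
    also passes the tight selection.
    """
    if not 0.0 <= eps_f < eps_r <= 1.0:
        raise ValueError("expected 0 <= eps_f < eps_r <= 1")
    delta = 1.0 if is_tight else 0.0
    return eps_f / (eps_r - eps_f) * (eps_r - delta)


def fake_yield(n_loose, n_tight, eps_r, eps_f):
    """Closed-form fake yield in the tight selection, Eq. (5.7)."""
    return eps_f / (eps_r - eps_f) * (eps_r * n_loose - n_tight)


# Toy loose sample: 1000 loose events, 700 of which also pass tight.
# The efficiencies are illustrative values, not measured ones.
eps_r, eps_f = 0.9, 0.2
events = [True] * 700 + [False] * 300      # True = passes tight

summed = sum(fake_weight(eps_r, eps_f, t) for t in events)
closed = fake_yield(len(events), sum(events), eps_r, eps_f)
print(f"sum of weights = {summed:.3f}, Eq. (5.7) = {closed:.3f}")
```

Tight events receive a negative weight and loose events that fail the tight selection receive a positive one, so the sum reproduces Equation (5.7) when the efficiencies are constant. With efficiencies parametrised per event, only the sum of weights remains applicable.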

The matrix method for events in the dilepton selection is outlined as follows. We label the number of observed events with two tight leptons as N_tt, with one tight and one loose lepton as N_tl and N_lt, and with two loose leptons as N_ll. The leading lepton is the tight one in the N_tl region and the loose one in the N_lt region. Using the efficiencies ε_r and ε_f, linear equations are obtained that relate the observed yields to the number of events with two real leptons, N_rr, with one real and one fake lepton, N_rf and N_fr, and with two fake leptons, N_ff:

$$\begin{pmatrix} N_{rr} \\ N_{rf} \\ N_{fr} \\ N_{ff} \end{pmatrix} = M^{-1} \begin{pmatrix} N_{tt} \\ N_{tl} \\ N_{lt} \\ N_{ll} \end{pmatrix}, \tag{5.9}$$

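where M is a 4×4 matrix written in terms of ε_r and ε_f. The matrix is expressed by

$$M = \begin{pmatrix}
\varepsilon_{r,1}\varepsilon_{r,2} & \varepsilon_{r,1}\varepsilon_{f,2} & \varepsilon_{f,1}\varepsilon_{r,2} & \varepsilon_{f,1}\varepsilon_{f,2} \\
\varepsilon_{r,1}\bar{\varepsilon}_{r,2} & \varepsilon_{r,1}\bar{\varepsilon}_{f,2} & \varepsilon_{f,1}\bar{\varepsilon}_{r,2} & \varepsilon_{f,1}\bar{\varepsilon}_{f,2} \\
\bar{\varepsilon}_{r,1}\varepsilon_{r,2} & \bar{\varepsilon}_{r,1}\varepsilon_{f,2} & \bar{\varepsilon}_{f,1}\varepsilon_{r,2} & \bar{\varepsilon}_{f,1}\varepsilon_{f,2} \\
\bar{\varepsilon}_{r,1}\bar{\varepsilon}_{r,2} & \bar{\varepsilon}_{r,1}\bar{\varepsilon}_{f,2} & \bar{\varepsilon}_{f,1}\bar{\varepsilon}_{r,2} & \bar{\varepsilon}_{f,1}\bar{\varepsilon}_{f,2}
\end{pmatrix}, \tag{5.10}$$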

where the indices 1 and 2 on ε_r and ε_f refer to the leading and sub-leading lepton in the event, respectively, and ε̄ stands for (1 − ε). We obtain four event weights: w_rr, w_rf, w_fr, and w_ff. A fake lepton background event with two loose leptons contains at least one fake lepton; therefore, the probability for the event is given by w_rf + w_fr + w_ff. The event weight with two tight leptons is expressed by

$$w_{tt} = \varepsilon_{r,1}\varepsilon_{f,2}\,w_{rf} + \varepsilon_{f,1}\varepsilon_{r,2}\,w_{fr} + \varepsilon_{f,1}\varepsilon_{f,2}\,w_{ff}. \tag{5.11}$$

We evaluate the efficiencies by measuring the contribution of fake leptons to the top quark pair production and single top quark production processes in pp collision events at √s = 8 TeV [56]. In the event selection, we require either the single electron trigger, labelled e24vhi or e60, or the single muon trigger, labelled mu24i or mu36. We define a signal region, which is used for validation and for the estimation of systematic uncertainties, and a control region, which is used for the estimation of the efficiencies. For the validation, we use simulation samples of the signal processes and of the background processes, namely Z/W+jets, diboson, and dijet production. Both regions include the lepton plus jets channel, which is divided into the e+jets and µ+jets channels, and the dilepton channel.

By applying cuts on the number of jets and of b-tagged jets, the signal region for the lepton plus jets channel is separated into two regions: at least four jets with pretag, and at least four jets with at least one b-tagged jet, where "pretag" indicates that there is no requirement on the number of b-tagged jets. To suppress the background from fake leptons in the two signal regions, we also require exactly one tight lepton and apply criteria on the missing transverse energy E_T^miss and the transverse mass m^W_T, where m^W_T = √(2 p^lepton_T E_T^miss (1 − cos ∆φ)) and ∆φ is the difference in azimuthal angle between the lepton and E_T^miss. Selected events in the e+jets channel satisfy E_T^miss > 30 GeV and m^W_T > 30 GeV. Selected events in the µ+jets channel satisfy E_T^miss > 20 GeV and E_T^miss + m^W_T > 60 GeV.

The dilepton channel is separated into the same flavour channels, e⁻e⁺ and µ⁻µ⁺, and the different flavour channel, eµ. All regions require exactly two opposite-sign (OS) leptons and exactly two jets. The event selection for the same flavour channels requires cuts on the dilepton invariant mass m_ll: m_ll > 15 GeV and |m_ll − m_Z| > 10 GeV. In addition, E_T^miss > 60 GeV is required. The event selection for the different flavour channel requires a cut on the scalar sum of the transverse energies of leptons and jets, H_T > 130 GeV.

The signal region in each channel is classified into a pretag region and a region with at least one b-tagged jet.
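Before turning to the efficiency measurement, the dilepton matrix method of Equations (5.9) to (5.11) can be illustrated with a closure check on toy numbers. The sketch below is not the analysis code. It assumes that "loose" in N_tl, N_lt and N_ll means loose but failing tight, as implied by the (1 − ε) entries of M, and it orders the real/fake vector as (N_rr, N_rf, N_fr, N_ff) to match the columns of M. As an implementation choice made here, M is inverted through its factorisation into a Kronecker product of two 2×2 single-lepton matrices. All function names and numerical values are illustrative.

```python
from itertools import product


def lepton_matrix(eps_r, eps_f):
    """2x2 block for one lepton: rows (tight, loose-not-tight), columns (real, fake)."""
    return [[eps_r, eps_f], [1.0 - eps_r, 1.0 - eps_f]]


def invert_2x2(a):
    """Closed-form inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]   # equals eps_r - eps_f here
    if det == 0.0:
        raise ValueError("eps_r and eps_f must differ")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix.

    Row order: tt, tl, lt, ll. Column order: rr, rf, fr, ff.
    """
    pairs = list(product(range(2), repeat=2))
    return [[a[i][k] * b[j][l] for k, l in pairs] for i, j in pairs]


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def fake_tight_tight(n_obs, r1, f1, r2, f2):
    """Fake-lepton estimate in the tight-tight region, Eqs. (5.9) and (5.11).

    n_obs = (N_tt, N_tl, N_lt, N_ll); indices 1 and 2 denote the leading
    and sub-leading lepton.
    """
    m_inv = kron(invert_2x2(lepton_matrix(r1, f1)),
                 invert_2x2(lepton_matrix(r2, f2)))
    _n_rr, n_rf, n_fr, n_ff = matvec(m_inv, n_obs)
    return r1 * f2 * n_rf + f1 * r2 * n_fr + f1 * f2 * n_ff


# Closure check: start from assumed true yields, build the observed yields
# with M from Eq. (5.10), then recover the fake contribution.
truth = [500.0, 40.0, 30.0, 10.0]           # (N_rr, N_rf, N_fr, N_ff), toy values
r1, f1, r2, f2 = 0.90, 0.20, 0.85, 0.30     # illustrative efficiencies
m = kron(lepton_matrix(r1, f1), lepton_matrix(r2, f2))
observed = matvec(m, truth)
fake_tt = fake_tight_tight(observed, r1, f1, r2, f2)
print(f"fake tight-tight yield = {fake_tt:.2f}")
```

Because the relation is linear, the same function applied to a single event, with `n_obs` set to 1 in the event's category and 0 elsewhere, gives a per-event weight, and summing those weights over events reproduces the yield-level result for constant efficiencies.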

In the control region, we measure the efficiencies ε_r and ε_f. The fake efficiency ε_f is derived from the control regions in the e+jets and µ+jets channels; the control region for the fake efficiency is denoted CR_f. The real efficiency ε_r is derived from the control regions in the e⁻e⁺ and µ⁻µ⁺ channels.

In the ε_r measurement, we use the tag-and-probe method with Z → e⁻e⁺ and Z → µ⁻µ⁺ events. This method produces an unbiased sample of loose leptons (probe leptons) by applying the tight selection requirement to the other lepton from the same decay (tag lepton). We determine ε_r by applying the tight selection to the probe leptons. For each pair, we require that the tag and probe leptons have opposite-sign charges and that the dilepton invariant mass satisfies 80 < m_ll < 100 GeV.

In the ε_f measurement, we apply selections that yield a sample enriched in fake leptons. Events in CR_f have exactly one loose lepton and at least one jet. For the e+jets channel, we also require m^W_T < 20 GeV and E_T^miss + m^W_T < 60 GeV. For the µ+jets channel, we also require |d^sig₀| > 5, where d^sig₀ is the muon impact parameter significance, calculated as d^sig₀ = d₀/√err(d₀). The fake efficiency is derived as the ratio of the number of events with tight leptons to the number of events with loose leptons in CR_f.
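The two efficiency measurements reduce to counting ratios once the samples are selected. The sketch below illustrates this under simplifying assumptions: the record type `TagProbePair`, its fields, and the toy inputs are hypothetical, only the opposite-sign and mass-window requirements from the text are applied, and no background subtraction or MC correction is attempted.

```python
from dataclasses import dataclass


@dataclass
class TagProbePair:
    """One Z candidate: a tight tag lepton plus a loose probe lepton (hypothetical record)."""
    m_ll: float             # dilepton invariant mass in GeV
    opposite_sign: bool     # tag and probe have opposite charges
    probe_is_tight: bool    # probe also passes the tight selection


def ratio(n_pass, n_total):
    """Efficiency as a plain counting ratio."""
    if n_total <= 0:
        raise ValueError("empty sample")
    return n_pass / n_total


def real_efficiency(pairs):
    """eps_r: fraction of selected probes that pass tight (80 < m_ll < 100 GeV, OS)."""
    probes = [p for p in pairs if p.opposite_sign and 80.0 < p.m_ll < 100.0]
    return ratio(sum(p.probe_is_tight for p in probes), len(probes))


def fake_efficiency(loose_is_tight):
    """eps_f: tight-to-loose ratio in CR_f; one bool per CR_f event."""
    return ratio(sum(loose_is_tight), len(loose_is_tight))


# Toy inputs, for illustration only.
pairs = (
    [TagProbePair(91.0, True, True)] * 9      # probes passing tight
    + [TagProbePair(90.0, True, False)]       # probe failing tight
    + [TagProbePair(60.0, True, False)]       # outside the mass window, ignored
    + [TagProbePair(91.0, False, True)]       # same-sign pair, ignored
)
eps_r = real_efficiency(pairs)
eps_f = fake_efficiency([True] * 2 + [False] * 8)
print(f"eps_r = {eps_r:.2f}, eps_f = {eps_f:.2f}")
```

In practice the same ratios would be evaluated in bins of the variables on which the efficiencies depend, rather than once for the whole sample.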

Both efficiencies, ε_r and ε_f, are measured as functions of different variables, such as the lepton p_T, the lepton |η|, the angular distance between the lepton and the closest jet, min ∆R(l, jet), the number of b-tagged jets, and the trigger options. Figures 5.5 and 5.6 show the real efficiency ε_r and the fake efficiency ε_f in the e+jets channel as functions of the different variables. Figures 5.7 and 5.8 show the real efficiency ε_r and the fake efficiency ε_f in the µ+jets channel as functions of the different variables.

Figure 5.5: Real efficiency ε_r and fake efficiency ε_f in the e+jets channel as functions of the different variables and the trigger options. The variables are the electron cluster eta |η|^e, the electron transverse energy p^e_T, and the minimum ∆R between the electron and jets. e60 indicates the high-p_T trigger, e24vh the low-p_T trigger without the isolation cut, and e24vhi the low-p_T trigger with the isolation cut.

Figure 5.6: Real efficiency ε_r and fake efficiency ε_f in the e+jets channel as functions of the different variables and the trigger options. The variables are the leading jet p_T, p^leading jet_T, the number of jets n_jet, the number of b-tagged jets n_b-jet, and the angle in the transverse plane between the electron and the MET, ∆φ(e, E_T^miss). e60 indicates the high-p_T trigger, e24vh the low-p_T trigger without the isolation cut, and e24vhi the low-p_T trigger with the isolation cut.

Figure 5.7: Real efficiency ε_r and fake efficiency ε_f in the µ+jets channel as functions of the different variables and the trigger options. The variables are the muon eta |η|^µ, the muon transverse momentum p^µ_T, and the minimum ∆R between the muon and jets. mu36 indicates the high-p_T trigger, mu24 the low-p_T trigger without the isolation cut, and mu24i the low-p_T trigger with the isolation cut.

Figure 5.8: Real efficiency ε_r and fake efficiency ε_f in the µ+jets channel as functions of the different variables and the trigger options. The variables are the leading jet p_T, p^leading jet_T, the number of jets n_jet, the number of b-tagged jets n_b-jet, and the angle in the transverse plane between the muon and the MET, ∆φ(µ, E_T^miss). mu36 indicates the high-p_T trigger, mu24 the low-p_T trigger without the isolation cut, and mu24i the low-p_T trigger with the isolation cut.

The efficiencies, expressed as functions of different variables, are used for the weight calculation in Equation (5.8). Because the variables used for the efficiency measurements are correlated, the efficiency is expressed as a function of different combinations of the variables:

$$\varepsilon_k(x_1, \ldots, x_N; y_1, \ldots, y_M) = \frac{1}{\varepsilon_k(x_1, \ldots, x_N)^{M-1}} \cdot \prod_{j=1}^{M} \varepsilon_k(x_1, \ldots, x_N; y_j), \tag{5.12}$$

where ε_k(x₁, ..., x_N) represents the efficiency measured as a function of all the x variables, and ε_k(x₁, ..., x_N; y_j) represents the efficiency measured as a function of all the x variables and of the variable y_j. The x variables are typically discrete and the y variables are typically continuous. In Equation (5.12), we assume that there are no correlations between the y variables.
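Equation (5.12) can be read as the x-only efficiency multiplied by one correction factor ε_k(x; y_j)/ε_k(x) per y variable. A minimal sketch of the combination is given below; the function name and the input values are illustrative, and the efficiencies are assumed to have already been looked up in the appropriate bins.

```python
from math import prod


def combined_efficiency(eff_x, eff_xy):
    """Combine efficiencies according to Eq. (5.12).

    eff_x  : eps_k(x_1, ..., x_N), measured in bins of the x variables only
    eff_xy : list of eps_k(x_1, ..., x_N; y_j), one entry per y variable
    """
    if eff_x <= 0.0:
        raise ValueError("eff_x must be positive")
    m = len(eff_xy)
    return prod(eff_xy) / eff_x ** (m - 1)


# Illustrative values: x-only efficiency 0.8, two y-dependent measurements.
eff = combined_efficiency(0.8, [0.9, 0.7])
print(f"combined efficiency = {eff:.4f}")
```

With a single y variable (M = 1) the denominator is unity and the function returns ε_k(x; y₁) unchanged.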

The main sources of systematic uncertainty on the fake lepton background estimation are the measurement of the real efficiency, the use of MC simulation to correct the efficiency measurements, the different background composition in the signal regions, and the treatment of the dependence of the efficiencies on lepton and event properties.

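In document: Search for charged Higgs bosons decaying into a top quark and a bottom quark using proton-proton collision data at a centre-of-mass energy of 8 TeV in the ATLAS experiment (pages 146-151)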