I've been having a look at event selection; the Thorsten-Kuhl paper starts off with a cut-based selection and then uses a likelihood selection. For the cut-based selection they used:
- Two leptons found (electron identification is performed before this). At the moment I'm cheating the electron identification from Monte Carlo because the electron ID was getting messy due to the extra reconstructed neutrals.
- The invariant mass of the two leptons has to be between 70 GeV and 110 GeV.
- Cos(theta) of the event's primary thrust axis less than 0.9.
- The invariant mass of the two hadronic jets has to be greater than 100 GeV.
This plot shows these variables for the ZZ sample and the Higgs to c cbar sample. There are about twice as many events in the ZZ sample. The peak at zero in the primary lepton mass is there because I haven't cut out the cases where only one electron is found, and the low-energy rise in the ZZ sample is because it's a Z/gamma* sample.
This plot shows the number of events that pass these cuts, with 0 being the number of initial events and 1 to 4 the number remaining after each of the four cuts above, in the order listed.
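As a rough illustration of how a cut flow like this can be counted, here is a minimal Python sketch. It is not the actual analysis code: the event layout (dicts with hypothetical keys `leptons`, `jets` and `cos_theta_thrust`, four-vectors as `(E, px, py, pz)` in GeV) is made up for the example, and it assumes the thrust cut is applied to the absolute value of cos(theta) and that every event has been forced into exactly two jets.

```python
import math


def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    # Clamp small negative values that can come from rounding.
    return math.sqrt(max(m2, 0.0))


# The four cuts, in the order they are listed in the post.
# Assumption: the thrust cut is on |cos(theta)|; the event keys are hypothetical.
CUTS = [
    lambda ev: len(ev["leptons"]) == 2,
    lambda ev: 70.0 < invariant_mass(*ev["leptons"]) < 110.0,
    lambda ev: abs(ev["cos_theta_thrust"]) < 0.9,
    lambda ev: invariant_mass(*ev["jets"]) > 100.0,
]


def cut_flow(events, cuts=CUTS):
    """Return [n_initial, n_after_cut_1, ..., n_after_cut_N], applying cuts in sequence."""
    counts = [len(events)]
    surviving = list(events)
    for cut in cuts:
        surviving = [ev for ev in surviving if cut(ev)]
        counts.append(len(surviving))
    return counts
```

Because the cuts are applied sequentially, the lepton mass cut is only ever evaluated on events that already have exactly two leptons, which is why it can unpack the lepton list without checking its length.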
For the likelihood selection I've been looking at the following inputs:
- Cos(theta) of the event's primary thrust axis.
- The event's thrust.
- The invariant mass of the two hadronic jets.
- The energy difference of the two hadronic jets.
The Thorsten-Kuhl paper also uses the invariant mass of the two jets after applying a kinematic fit, but I haven't coded that up yet.
This plot and this plot show these four distributions normalised to 1. For the likelihood I used these plots as reference distributions, and used the height at a given measurement as a "probability" of getting that measurement for signal or background (since they're normalised to one). I looked at using just the signal references, basically just multiplying these "probabilities" together, and also at using a bastardisation of Bayes' theorem to include the background. For that I did the same as before but divided by the sum of the background and signal "probabilities". To be mathematically sound I should probably sum over all possible backgrounds, but then I'm a physicist, not a mathematician.
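The two likelihood variants described above could be sketched in Python roughly as follows. This is an illustration under stated assumptions rather than the code used for the plots: "normalised to 1" is taken to mean the bin heights sum to one, the division in the second variant is applied to the products over all inputs rather than input by input, and every function name and the dict layout of a reference histogram are invented for the example.

```python
def bin_index(x, lo, hi, n_bins):
    """Index of the equal-width bin on [lo, hi) containing x, or None if outside."""
    if not lo <= x < hi:
        return None
    return min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)


def make_reference(values, lo, hi, n_bins):
    """Histogram `values` and normalise so the bin heights sum to 1 (an assumption)."""
    counts = [0] * n_bins
    for v in values:
        i = bin_index(v, lo, hi, n_bins)
        if i is not None:
            counts[i] += 1
    total = sum(counts)
    heights = [c / total for c in counts] if total else [0.0] * n_bins
    return {"lo": lo, "hi": hi, "heights": heights}


def height(ref, x):
    """Height of the reference distribution at measurement x (0 outside the range)."""
    i = bin_index(x, ref["lo"], ref["hi"], len(ref["heights"]))
    return 0.0 if i is None else ref["heights"][i]


def product_likelihood(refs, measurements):
    """Signal-only variant: multiply the per-variable 'probabilities' together."""
    result = 1.0
    for name, x in measurements.items():
        result *= height(refs[name], x)
    return result


def likelihood_ratio(sig_refs, bkg_refs, measurements):
    """Second variant: signal product divided by (signal + background) products."""
    s = product_likelihood(sig_refs, measurements)
    b = product_likelihood(bkg_refs, measurements)
    return s / (s + b) if s + b > 0 else 0.0
```

Here `measurements` is a dict from input name (for example the thrust or the dijet mass) to its value for one event, and `sig_refs` and `bkg_refs` map the same names to reference histograms. Multiplying the heights treats the inputs as independent, which is a simplification.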
This plot shows efficiency versus purity for the two methods (efficiency on the x-axis). Note that because I haven't got particularly large samples, this plot was made with the same events as the reference distributions. A bit shoddy, but if I can't get it to work like that then I may as well give up now. There are 6375 ZZ events in the background and 5864 signal events (slightly more c cbar than b bbar).
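For reference, an efficiency-purity curve of this kind can be produced by scanning a cut on the likelihood output. The sketch below assumes efficiency means the fraction of signal events above the cut and purity means the signal fraction among all events above the cut, using raw event counts with no cross-section weighting; the function name and inputs are hypothetical.

```python
def efficiency_purity(sig_scores, bkg_scores, thresholds):
    """Return a list of (efficiency, purity) points, one per threshold.

    Events with a score >= threshold count as selected. Purity is None when
    nothing is selected, since it is undefined there.
    """
    points = []
    for t in thresholds:
        n_sig = sum(1 for s in sig_scores if s >= t)
        n_bkg = sum(1 for b in bkg_scores if b >= t)
        efficiency = n_sig / len(sig_scores)
        selected = n_sig + n_bkg
        purity = n_sig / selected if selected else None
        points.append((efficiency, purity))
    return points
```

With unweighted samples the purity depends on the relative sizes of the signal and background samples, so the absolute purity values are only meaningful for comparing the two methods on the same events.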