The events used for the previous post were required to have the following:
- Number of electrons >= 2
- 84 < di-electron mass < 105 GeV
- 100 < di-jet mass < 130 GeV
- cos(theta) of thrust axis < 0.9
- 115 < recoil mass < 160 GeV
- Number of reconstructed particles >= 45
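For reference, the cuts above can be written as a single predicate. This is only a minimal sketch: the `Event` class and its field names are hypothetical stand-ins for whatever the real ntuple provides, and the thrust-axis cut is applied exactly as written in the list (no absolute value is assumed).

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical container; the field names are illustrative only.
    n_electrons: int
    dielectron_mass: float   # GeV
    dijet_mass: float        # GeV
    cos_theta_thrust: float
    recoil_mass: float       # GeV
    n_reco_particles: int


def passes_selection(ev: Event, min_particles: int = 45) -> bool:
    """Return True if the event satisfies every cut in the list."""
    return (
        ev.n_electrons >= 2
        and 84.0 < ev.dielectron_mass < 105.0
        and 100.0 < ev.dijet_mass < 130.0
        and ev.cos_theta_thrust < 0.9
        and 115.0 < ev.recoil_mass < 160.0
        and ev.n_reco_particles >= min_particles
    )
```

Keeping the particle-multiplicity threshold as a parameter makes it easy to compare the tight value of 45 with a looser one such as 17.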
I set the 'number of reconstructed particles' cut quite high because my signal/sqrt(signal+background) was quite low and I wanted to improve it. A cut at around 17 can be used without losing a single partonic Higgs event; maybe I should have used that and then relied on the likelihood cut to get rid of everything else.
Anyway, I'll try Hajrah's cuts and see what I get.
The plots I posted this morning have also been changed to have titles, and the data is split down into the constituents for clarity.
Thursday, February 26, 2009
I've been playing around with the cuts, which has improved signal to background but the fit still isn't very good for the c-tag here.
The templates are here:
data
reference
I think it's just that statistical fluctuations mean the c-tag distributions in the test and reference samples don't look particularly similar. The previous plot, which Roberval commented on as being particularly good, was for the whole sample; with these new cuts the combined distribution looks similar (but not quite as good).
I've loosened the lower di-jet mass cut which seems to have allowed more b from (presumably) Z decays through, as can be seen by the higher b peak in the background than I was getting before. I thought this might mean the 3D fit (with the bc-tag added in too) might be better, because the background would be distinguishable from the ccbar and gg templates. The fit is below, marginally better but nothing to shout about.
here.
The b vs. bc templates are below. Ignore the axis labels: what is labelled as the c-likeness is actually the bc-likeness.
data
reference
LOI Documents
Just a reminder that the LOI documents for the analysis (draft notes etc.) can be found on the ILCILD wiki.
Wednesday, February 25, 2009
Monday, February 23, 2009
Chi2 fit
Results:
Removed the bins with fewer than 10 events from the chi^2 function, included statistical fluctuation effects from the Monte Carlo samples, histograms with 10x10 bins, signal samples only.
r_bb = 0.981 +- 0.031
r_cc = 1.13 +- 0.28
r_gg = 1.07 +- 0.22
r_bkg = 1.0 (fixed)
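A minimal sketch of a chi^2 of this kind, assuming unweighted Monte Carlo templates (so the variance of a scaled template bin r*t is r^2*t) and taking the data variance as the observed count. The function name and array layout are illustrative and are not the code used for the numbers above.

```python
import numpy as np


def chi2(r, data, templates, min_events=10):
    """Chi^2 between data and a sum of scaled templates.

    r         : scale factors, one per template (e.g. r_bb, r_cc, r_gg, r_bkg)
    data      : observed counts per bin (flattened histogram)
    templates : array of shape (n_templates, n_bins) with MC counts per bin
    Bins with fewer than `min_events` data events are left out of the sum.
    """
    r = np.asarray(r, dtype=float)
    data = np.asarray(data, dtype=float)
    templates = np.asarray(templates, dtype=float)

    expected = r @ templates
    # Assumption: Poisson error on the data plus Poisson error on each
    # unweighted MC template, scaled by r^2.
    variance = data + (r ** 2) @ templates

    mask = data >= min_events
    return float(np.sum((data[mask] - expected[mask]) ** 2 / variance[mask]))
```

Fixing a parameter such as r_bkg then amounts to holding that entry of `r` constant while the minimiser varies the others.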
Possibility to fit with chi2 function
I was studying the method for fitting with finite MC samples described in Barlow & Beeston, Comp Phys Comm 77 (1993) 219, and I ran into the following problems:
- Treatment of errors from the fit;
- Bins with zero entries;
- Method is valid only when the number of events in a bin of a template is much smaller than the total number of events in the histogram.
Facing these difficulties, I started to think about the possibility of using the chi^2 fit. Our data sample can have bins with a small number of events, but are those bins important for the fit? Ignoring the effects of statistical fluctuations, I checked how much the likelihood function changes if data bins with fewer than 10 events are removed (in both data and Monte Carlo, only bins with non-zero entries are considered). The likelihood function to be maximised (omitting the constant factorials) is
where d_i is the number of events in bin i of the data sample and f_i is the number of events in bin i from the fit "function".
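The expression itself is not reproduced in the text. Assuming the standard binned Poisson form, written as a logarithm with the constant factorial terms dropped, it would read as follows; whether the original was quoted as L or as ln L is not stated.

```latex
\ln L = \sum_i \left( d_i \ln f_i - f_i \right)
```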
The plot below shows the likelihood function divided by the likelihood function for d_i >= 10, as a function of the fit parameters r_xx. The ratio is very stable over a parameter range as large as, or larger than, the "expected" errors on these parameters. This means that the likelihood function considering only bins with 10 or more events is essentially the original likelihood times 1.011. The maxima and the errors will be the same.
We can use the chi^2 approximation if we neglect the bins with fewer than 10 events in the data sample.
What is not clear to me is the case where a bin from the MC samples has a small number of events. If I remove the bins of the MC sample with fewer than 10 events, all the curves above are still flat, but the ratio goes up to 1.014.
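The check described above can be sketched as follows. This assumes the binned Poisson log-likelihood with the factorials dropped, and that the ratio is taken between log-likelihoods, which the text does not state explicitly. The function names are illustrative.

```python
import numpy as np


def log_likelihood(data, fit, min_events=0):
    """Sum of d_i*ln(f_i) - f_i over bins with non-zero entries and
    at least `min_events` data events."""
    data = np.asarray(data, dtype=float)
    fit = np.asarray(fit, dtype=float)
    mask = (data > 0) & (fit > 0) & (data >= min_events)
    return float(np.sum(data[mask] * np.log(fit[mask]) - fit[mask]))


def likelihood_ratio(data, fit, min_events=10):
    """All-bins log-likelihood divided by the one restricted to
    bins with at least `min_events` data events."""
    return log_likelihood(data, fit) / log_likelihood(data, fit, min_events)
```

Evaluating `likelihood_ratio` on a grid of r_xx values, with `fit` rebuilt from the templates at each point, gives curves of the kind described; a flat curve means the low-statistics bins do not move the maximum.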
Sunday, February 15, 2009
Thursday, February 12, 2009
Fitting problems
I tried the extended likelihood fit in RooFit but couldn't get it to minimise, although I didn't spend huge amounts of time on it. I also tried implementing the method myself but it gets unstable for large binning.
As a fallback I decided to get results with the chi2 fit, using the combined errors from the templates and data, but the results aren't good for ILD_00. The templates look fairly good though; the template is here and the data is here. That's for a polarisation of 80-30 (electrons -1, positrons +1).
These are the results I get. I'm currently playing around to see if changing the polarisation or luminosity changes anything.
Wednesday, February 11, 2009
Event selection with TMVA
I checked that the input variables are not affected by the initial polarisation. I also checked the correlations, and they remain the same. So one can use the samples with different polarisations, with the set of variables I am using for training, without much to worry about.
I played a bit more with the parameters in TMVA and was finally able to use the recoil mass as a discriminating variable, replacing the energy of the dilepton (Z). I also added the remaining samples that I did not use last time for the training (the training set is now twice as large).
I got slightly better results, more from optimising parameters and using the recoil mass than from using more samples in the training. Again, the likelihood gives the best result, compared with boosted decision trees and neural networks. The neural network result was also better than before, simply from adjusting some parameters, namely the number of nodes in the hidden layers. The boosted decision trees method is sinister. I don't understand that method well, but it seems that one needs lots of events and lots of input variables as well. At least now, after adjusting the pruning of the trees, its outputs for the training and test samples are not as discrepant as before.
Previously, with the old likelihood (in number of events):
- signal : 3327 -> 2681 (efficiency 80.6%)
- background: 31132 -> 2722 (efficiency 8.7%)
Now, with the new likelihood + tuning (in number of events):
- signal : 3327 -> 2725 (efficiency 81.9%)
- background: 31132 -> 2386 (efficiency 7.7%)
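As a cross-check, the efficiencies and the signal/sqrt(signal+background) figure of merit can be recomputed from the quoted event counts. This is a small sketch with illustrative helper names, and it uses the raw counts as given, without any luminosity weighting.

```python
import math


def efficiency(n_before: int, n_after: int) -> float:
    """Fraction of events surviving the selection."""
    return n_after / n_before


def significance(signal: float, background: float) -> float:
    """Simple figure of merit S / sqrt(S + B)."""
    return signal / math.sqrt(signal + background)


# Event counts quoted above: (before, after) for signal and background.
old = {"signal": (3327, 2681), "background": (31132, 2722)}
new = {"signal": (3327, 2725), "background": (31132, 2386)}

for label, counts in (("old", old), ("new", new)):
    eff_s = efficiency(*counts["signal"])
    eff_b = efficiency(*counts["background"])
    fom = significance(counts["signal"][1], counts["background"][1])
    print(f"{label}: eff_sig={eff_s:.1%} eff_bkg={eff_b:.1%} S/sqrt(S+B)={fom:.2f}")
```

With these counts the figure of merit moves from roughly 36.5 to roughly 38.1, consistent with the "slightly better" description.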
I will soon try with Mark's variables to see if that improves.
Fit with finite MC - first attempt
My first attempt to fit with finite MC samples. Results here.
Still to check:
- How to treat bins with zero entries.
- Are errors calculated correctly?
- How to fix one parameter.