I've been trying to understand the differences Mark found between the results using the SGV-Tesla and Mokka-LDCPrime_02Sc neural networks. I am fairly confident that the new neural networks are better than the old ones: not only do other plots, such as the purity-efficiency curves, show better performance, but the old ones are not correct, as some overtraining can be spotted.
I wrote up my conclusions on the neural networks on the web page here.
Friday, November 28, 2008
Thursday, November 27, 2008
Purity x Efficiency (Z-Higgs)
Here is the flavour tag purity-efficiency plot with Z-Higgs events comparing Mokka-LDCPrime_02Sc neural networks (open symbol) and SGV-Tesla neural networks (solid symbol).
The c-tag performance is similar for mid to high efficiencies, slightly favouring Mokka-LDCPrime_02Sc, and Mokka-LDCPrime_02Sc is much better at low efficiencies.
Flavour Tagging results (so far)
I've created a new sample for LDCPrime_02Sc_p02 from the generator files in the ILC database (50,000 ZH->eeXX and about 62,000 ZZ->eeqq). I was looking at the flavour tag results using the LDCPrime_02Sc nets (note: not "_p02") that Roberval sent me, as well as his tuned JProb parameters and RPCutProcessor parameters. I didn't get very good results with those:
New parameters and new nets
To see if it was something to do with the new samples I then tried the old tagging:
Old parameters and old (SGV trained) nets
That was much improved, so to see if it was the new nets or parameters that were causing the problem, I tried redoing the new tag with the old nets (because it was the easiest thing to do):
New parameters and old nets
That was much better than both (although slightly smaller peak for the b sample). It looked like the b tag was okay using the new nets, so I wondered if it was any good using the new b-tag nets and the old c-tag nets, to see if it was actually just the c-tag that's a bit off:
New parameters and new b nets, old c nets
Although much better than everything done with the new nets, the b tag is still not as good as with the old nets.
Wednesday, November 26, 2008
Branching Ratios
We aim to investigate the BR of the Higgs decaying into bb, cc and gg for a SM Higgs (120 GeV).
In our case all other decays are the background.
We separated all flavours (the muons are separated first) and made plots of BTag, CTag and BCTag, shown in the three plots at the bottom (red for bb events, blue for cc events and black for light-quark events).
Using the method in the Kuhl-Desch paper, we split our 20,000 events into Data and Sample events in a 1:1 ratio.
As an exercise, we tried to find the parameters for the bb/gg and cc events by setting the parameters for gg/bb and the background equal to one. The likelihood function is then plotted as shown below (two plots on top: the first is cc vs gg and the second is gg vs cc).
In order to get the fitted values of the parameters, we minimise the likelihood function, and then use the fitted values to find the branching ratios.
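A toy sketch of this procedure (made-up three-bin templates, not our real histograms, and a crude grid scan rather than a real minimiser): model the pseudo-data as r_bb·T_bb + r_cc·T_cc + T_bkg with the background fixed to one, and minimise a Poisson negative log-likelihood.

```python
import math

# Hypothetical three-bin tag templates (stand-ins for the real histograms)
T_bb  = [80.0, 15.0,  5.0]   # bb template
T_cc  = [10.0, 60.0, 30.0]   # cc template
T_bkg = [20.0, 20.0, 60.0]   # everything else, parameter fixed to 1

def nll(r_bb, r_cc, data):
    """Poisson negative log-likelihood (constant terms dropped)."""
    total = 0.0
    for i, n in enumerate(data):
        mu = r_bb * T_bb[i] + r_cc * T_cc[i] + T_bkg[i]
        total += mu - n * math.log(mu)
    return total

# Pseudo-data built with true parameters r_bb = 1.2, r_cc = 0.8
data = [1.2 * b + 0.8 * c + g for b, c, g in zip(T_bb, T_cc, T_bkg)]

# Crude grid minimisation in steps of 0.01; a real fit would use MINUIT
best = min(((nll(rb / 100.0, rc / 100.0, data), rb / 100.0, rc / 100.0)
            for rb in range(50, 200) for rc in range(50, 200)),
           key=lambda t: t[0])
_, r_bb_fit, r_cc_fit = best   # recovers the true (1.2, 0.8)
```

The fitted scale factors multiply the Standard Model expectation, so the branching ratios follow directly from them.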
Thursday, November 6, 2008
Software issues
I've come across two software issues recently, both of which are fairly old but here's a bit more information:
Update 12:15
When going to post about the second point, I saw there was a post in the LCIO forum, "stdhepjob fails on 64 bit". Basically, Roman Poeschl tracked down exactly where the error occurs and Jan Engels fixed it in the LCIO CVS on the 29th of September.
So to fix Mokka crashing when reading stdhep on a 64bit machine, you need the HEAD version of LCIO or a tag made after the 29th of September.
- Mokka crashing when reading stdhep files - I've managed to fix this by changing a few things, but I'm now having trouble breaking it again to see exactly which of those things fixed it. Mokka uses the stdhep reader from LCIO, but I'm not entirely sure whether that uses CLHEP or not (I don't think it does). I used the HEAD versions of CLHEP, Mokka and LCIO, and the most recent tag of Geant4. I also compiled LCIO with CLHEP, just in case the reader does use CLHEP. After all of that, Mokka can read stdhep.
- LCIO can't read files with Pandora clusters and CalorimeterHits in them - I came across this ages ago, but thought it was because the file size was too large. If you do try and read one of these files, you get the error
"*** glibc detected *** free(): invalid pointer: 0x0000000000b629e0 ***"
After reading in the first event, I think LCIO is trying to delete the CalorimeterHits twice: once for the normal collection and once for the hits associated to the Pandora clusters. I don't know if this is a problem with LCIO or with Pandora; I'll put a post on the forums.
Thursday, October 23, 2008
Muon identification with TMVA
As I said in the meeting this morning, I am learning how to use the TMVA (Toolkit for Multivariate Data Analysis with ROOT) package to improve the signal-background separation.
My learning process started using the single-particle samples that Hajrah used to obtain selection cuts for the muon identification. I used the same distributions as the input discriminating variables into four different methods: CutsGA (rectangular cuts), Likelihood, MLP (neural networks) and BDT (boosted decision trees).
All methods can give about 99.6% efficiency at the maximum S/√(S+B), i.e. at the optimal background rejection. Previous cuts give 97.5% efficiency with similar purity.
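A minimal sketch of how that working point is chosen (the score histograms below are made-up stand-ins for the TMVA classifier output, not our real distributions): scan the cut position and keep the one maximising S/√(S+B).

```python
import math

# Hypothetical classifier-score histograms (counts per score bin, low to high)
signal_hist     = [1, 2, 5, 12, 30, 50]
background_hist = [40, 25, 10, 4, 1, 0]

def best_cut(sig, bkg):
    """Return (bin index of the optimal cut, significance S/sqrt(S+B))."""
    best_bin, best_z = 0, 0.0
    for cut in range(len(sig)):
        s = sum(sig[cut:])   # signal surviving a cut at this bin
        b = sum(bkg[cut:])   # background surviving
        if s + b == 0:
            continue
        z = s / math.sqrt(s + b)
        if z > best_z:
            best_bin, best_z = cut, z
    return best_bin, best_z

cut_bin, significance = best_cut(signal_hist, background_hist)
efficiency = sum(signal_hist[cut_bin:]) / sum(signal_hist)
```

The efficiency quoted in the post is the signal efficiency at this optimal cut; TMVA produces the same scan internally when it reports the working point.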
Some nice output plots from the package can be seen in my wiki page .
Thursday, October 2, 2008
Z-Higgs samples
Some numbers and comments for the discussion of the samples we need can be found here.
Thursday, September 25, 2008
pandora-pythia FSR
FYI
I just found out that pandora-pythia does not consider QED final-state radiation, i.e., final state electrons and muons do not emit photons.
Thursday, September 18, 2008
Background samples
According to the Kuhl and Desch paper the following processes contribute to the background of ZH:
- e+e- → Z/γ → qq(γ)
- e+e- → WW → qqqq
- e+e- → WW → qqlν
- e+e- → ZZ → qqqq
- e+e- → ZZ → qql+l-
- e+e- → ZZ → qqνν
- e+e- → Weν → qqeν
- e+e- → Ze+e- → qqe+e-
But only some of them contribute to the leptonic channel, as one can see on page 12. After the channel classification, i.e. 2 electrons or 2 muons with momentum > 15 GeV, we essentially need the processes:
- e+e- → ZZ → qql+l-
- e+e- → Ze+e- → qqe+e-
- e+e- → WW → qqlν
The list is ordered according to the contribution in the Kuhl-Desch paper. Process number 3 contributes less but has the largest cross section. If we need that sample (the cuts essentially remove this contribution) we should generate it for the nominal luminosity only, to save disk space, and compare it with the scaled signal and the Z background.
Concerning sample 2, I was wondering why the process e+e- → Zμ+μ- → qqμ+μ- was not considered.
For samples 1 and 2, as well as for the signal, we may need samples generated with luminosity of about 4 ab-1.
Another thing in the Kuhl-Desch paper that was not clear to me is the meaning of the symbol q. On page 12, table 7, the column with qq says 2 fermions. Would these qq pairs be tau pairs, for example?
Wednesday, August 20, 2008
Neutrino effect on dijet mass
I have not yet cross-checked the effect of neutrinos on the dijet mass. If confirmed, we can take advantage of it to separate the ZH events from the ZZ background. The ZH events with dijet mass below about 110 GeV should have more missing energy than the ZZ events with dijet mass above about 92 GeV. This range, 92 < dijet mass < 110, contains most of the overlap between the two processes. Then, instead of cutting on the dijet mass at 96 GeV, we cut in the missing ET-dijet mass plane. By just applying the cut shown in the plot, obtained "by eye", 10% of the signal was recovered while 10% more background events were cut, compared with the case where just a dijet mass cut is applied (see previous post).
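The idea can be sketched as a simple acceptance function. The 92 and 110 GeV boundaries are from the numbers above, but the missing-ET threshold here is a hypothetical placeholder for the cut drawn "by eye" in the plot:

```python
def keep_event(mjj, missing_et, met_threshold=20.0):
    """Accept an event as ZH-like. Outside the overlap region the dijet
    mass alone separates ZH from ZZ; inside 92-110 GeV, where the two
    processes overlap, require large missing ET, since ZH events there
    should carry more missing energy than ZZ events.
    met_threshold is a made-up placeholder, not the real cut."""
    if mjj >= 110.0:
        return True          # well above the Z peak: keep
    if mjj <= 92.0:
        return False         # dominated by ZZ: reject
    return missing_et > met_threshold   # overlap region: use missing ET
```

In the real analysis the boundary in the plane would be a tuned curve rather than a single threshold, but the structure of the decision is the same.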
If time allows I will think of a smart way to perform the background separation using the information from the plot below. But the idea is there.
Monday, August 18, 2008
Neutrino effect on the di-jet mass
Cheated assignment with and without Monte Carlo neutrinos
This plot shows the reconstructed di-jet mass, but using the reconstructed particles chosen from Monte Carlo. There's also the same thing but with the Monte Carlo neutrinos taken into account, which shows the low Higgs tails are clearly due to neutrinos.
Friday, August 15, 2008
Minutes of Meeting 14/8/08
Present: Roberval, Clare, Mark, Joel
- Roberval showed plots of his studies on the 230 GeV sample of Z->µµ+H, some of which have been discussed before. There was discussion of a dijet mass cut vs the recoil mass cut, but the dijet mass from the Higgs shows a long tail at lower values. Using a dijet mass plus Z energy cut does give very good background rejection for a signal efficiency of about 1/3.
- Mark's plots from Monday for the 250 GeV Z->ee samples are similar, but have been showing a high tail in the dijet mass spectrum (particularly in the ZZ).
- After some further investigation and discussion in the meeting, it was decided that:
- The long low-mass tail in the Higgs dijet mass plots is probably neutrinos. Mark and Roberval should check this by adding in neutrinos from the MC.
- The high-mass tail in the electron samples is mainly due to photons (primarily FSR and bremsstrahlung) being included in the jets. Mark is looking at using a smaller Ycut to isolate photons (and taus) and then veto any jet with fewer than two tracks, as shown in his latest plots. He has also looked at picking out photons close to primary electrons.
- There was some confusion about whether we are expected to give a status report in the ILD meeting next week [it's actually the week after]. Since Victoria gave the last talk, it was decided that someone from Bristol (Mark) should give the next one.
Questions about beamstrahlung and ISR
I've been reading about initial state radiation (ISR) and beamstrahlung and I am not sure whether one can measure the photon or not. Some say it is possible for ISR, but I understand that in most events the photon(s) would be outside the detector acceptance. Besides that, the ISR calculations that I know of treat the electrons as having some sort of structure.
For beamstrahlung, it is said its effects could be measured, using for example the lumi detector, treating it as a collective effect of the particles in the beam. But the nature of the effect is statistical. What if the particles participating in the hard scattering are affected? Moreover, it should not be possible to measure any photon, because the radiation is emitted as the beam moves along the tunnel.
Does anybody understand these issues better? I believe that by understanding these things better we could try to do something to improve the measurements in the long term.
Thursday, August 14, 2008
Plots for meeting
Yet again I've left it a bit late to describe these here, I'll talk about them in the meeting and then write something here later.
Mass plot for ycut jet finding
We had a go at selecting the hadronic stuff by using ycut jet finding with a low ycut and requiring the jets to have at least 2 tracks. This is the mass plot for the best peak I get.
Comparison
This is that plot overlaid on the perfect particle assignment from Monte Carlo, and on the plot from removing photons close to the electrons. The ycut method has a slightly smaller peak than the brem-removal method, but a smaller high tail, which is the important thing.
Status of the analysis
A presentation with the status of what I am doing for the analysis can be found here.
Monday, August 11, 2008
Dijet mass tail: Pandora x Pythia
I got a sample generated with Pythia from the grid to compare with my Pandora-Pythia samples. Both show long tails towards low dijet mass values. I used 2000 events from each sample.
Improving the mass plots
When checking the performance of the precuts we found that the jet mass cut was cutting very little of the ZZ background out; there should be similar numbers of ZH and ZZ going into the likelihood cut but we had about 4 times as much ZZ. The mass plot (here, for cheated electron identification) is quite wide and has a long high tail on the ZZ.
We had a look at several things, like trying ycut jet finding in case it was initial state radiation photons being lumped in as well when forcing to 2 jets. Nothing conclusive came up so I looked directly at a few events with the di-jet mass over 200 GeV. They all seemed to have highly energetic photons from the Z (the one that goes to electrons) or from the electrons themselves lumped into the jets. I had a go at cutting out any photons (as identified by Pandora) that are within 4 degrees of the primary electrons and came up with this. Here's a plot of the energy and number of these photons.
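The photon veto can be sketched as below. Particles are plain (px, py, pz) momentum tuples and the 4-degree cone is the one mentioned above, but the function names are mine for illustration, not the actual processor code:

```python
import math

def opening_angle(p1, p2):
    """Angle in degrees between two momentum vectors (px, py, pz)."""
    dot = sum(a * b for a, b in zip(p1, p2))
    n1 = math.sqrt(sum(a * a for a in p1))
    n2 = math.sqrt(sum(b * b for b in p2))
    # clamp to guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (n1 * n2)))))

def veto_close_photons(photons, electrons, max_angle=4.0):
    """Drop any photon within max_angle degrees of a primary electron."""
    return [g for g in photons
            if all(opening_angle(g, e) > max_angle for e in electrons)]
```

The kept photons go back into the jet clustering; the vetoed ones are treated as bremsstrahlung/FSR off the primary electrons.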
After checking that, I thought I'd try to find out if there was anything else being thrown in that shouldn't be. From the Monte Carlo to reconstructed particle relations, I listed all the reconstructed particles that come from each Z (for ZZ; or the Z and H for ZH) separately and plotted the invariant mass for these. That's basically the best you could ever get without improving the particle flow/tracking/etcetera. Here's a plot of the 3 methods overlaid for the ZZ sample - original cheated electron ID, bremsstrahlung removed, and completely cheated jet association. Removing the bremsstrahlung photons does a pretty good job, but there's still something going into the jets that shouldn't. Here are some details about the particles that are left over when assigning to the Z or the H (or the Zs for the ZZ sample) from Monte Carlo.
Next steps:
- Find out what these left over particles are.
- See how the mass plots look using realistic electron finding and removing the bremsstrahlung.
- See how many ZZ and ZH get through the precuts now.
Friday, August 1, 2008
Minutes of Meeting 28/7/08
Present: Hajrah, Roberval, Victoria, Clare, Mark, Joel (i.e. everyone)
- Clare and Mark have been having another look at electron ID. When Mark looked a few months ago, the default efficiencies looked rather poor due to close bremsstrahlung photons and separated parts of the EM shower, so he has been cheating with MC information since. Now things look much better: using either Pandora defaults or a Kuhl-style simple ID (one cluster matched to one track, with cuts on track isolation and EM vs hadronic energy), efficiencies above 90% are reached. Further tuning will be done by Bristol MSci students.
- Roberval has been looking at samples of ZH with Z->µµ and H->anything at √s = 230 and 250 GeV (20,000 events in each). He has identified cutting on the Z recoil mass as an extremely promising way of reducing the ZZ background.
- It seems clear that the best way to perform both electron and muon analyses is to identify the leptons and then remove them before jet clustering.
- Victoria has been looking at the B and C-tagging performance in jets. The C-tag does not look very good and probably needs re-tuning. The B/C tag is not relevant for this analysis.
- Hajrah checked the electron ID cuts in the new detector model. The EM/total and E/p cuts need to be changed. This was done in single-particle simulations, so she will next check in physics events.
- Mark had some technical problems getting all of his plots on the web and will put them up after the meeting. The basic summary was that the flavour tag likeliness plots look good and we are probably ready to fit the templates, the last step in the analysis chain.
It was decided that we have enough to show in Wednesday's ILD meeting, and that Victoria will put together a talk.
ycut versus njet
I had a quick look into ycut versus njet. You can see plots and conclusions here. It seems fine to use the njet mode instead of ycut to reconstruct the Higgs from jets, if necessary.
What I am not sure about is whether it is fine to use the 2-jet mode for vertexing and flavour tagging. At least the ZVKIN algorithm uses the jet axis as a starting direction for the ghost track; how the algorithm behaves when the starting direction is far off needs to be investigated.
Thursday, July 31, 2008
Dijet mass
Below is the dijet mass distribution. No cut on the energy of the Z was applied, and the Z reconstructed from muons is required to have mass between 80 GeV and 102 GeV.
The jets were obtained by reclustering after removing the muons from the list of particles and forcing into 2 jets. I just don't understand why there are such long tails at low mass values. Still investigating. Improving the dijet mass resolution (by improving jet reconstruction) could save some events from the dijet mass cut.
Event selection tuning and Luminosity
The plots I've been showing must be looked at carefully. The event selection I am using so far is "non-standard" and not properly tuned. I still have to check the standard cuts for ZH event selection, together with my cuts, in order to maximise the statistics.
One also has to bear in mind that the number of events in the samples corresponds to 6.6 times the nominal luminosity (L), where L = 500 fb-1. So, divide the Y axis by 6.6 to get the "correct" number of events. A fine tuning of the selection is very important to get enough statistics for H into c cbar.
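As a concrete illustration of that rescaling (the bin contents below are made up, not from the real plots), dividing by 6.6 converts counts at the generated luminosity to the nominal L = 500 fb-1:

```python
# Rescale hypothetical histogram bin contents from the generated
# luminosity (6.6 x nominal) down to the nominal L = 500 fb^-1.
SCALE_FACTOR = 1.0 / 6.6

raw_counts = [660.0, 330.0, 66.0]   # counts at 6.6 x nominal L (made up)
nominal_counts = [c * SCALE_FACTOR for c in raw_counts]
```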
Removing further ZZ background
As I showed before, the cut on the energy of the Z can eliminate a fraction of the ZZ background (see the plot in the previous post below). I am now trying other cuts that could further improve the removal of the ZZ background.
So far I am applying the same cuts:
- M_Z between 88 GeV and 94 GeV
- E_Z between 100 GeV and 103 GeV
The plot below shows the dijet mass (jets were reclustered into 2 jets after the muons coming from a Z boson are removed). One can see that the signal and the background are quite distinct.
Applying a cut on the dijet mass, Mjj > 96 GeV, reduces the ZZ background further.
Wednesday, July 30, 2008
ZZ versus ZH kinematics
I generated, simulated and reconstructed some ZZ samples with a centre-of-mass energy of 230 GeV. The files are in DST format. Soon I will upload them to the grid.
From kinematics, the energy of the Z bosons in the ZZ process in the centre-of-mass frame is sqrt(s)/2. But because of the different masses of the Z and Higgs bosons, in the ZH process the energy of the Z boson is (s + mZ^2 - mH^2)/(2*sqrt(s)). We can see that in the plot of the Z boson energy below. The blue line was obtained using ZZ+ZH samples, whereas the dashed black is from ZH only. A cut on the energy of the Z may be worth making to also remove the ZZ background.
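A quick numerical check of the two-body kinematics (masses rounded; E_Z = (s + mZ^2 - mH^2)/(2*sqrt(s)), which reduces to sqrt(s)/2 when the recoil is another Z):

```python
M_Z, M_H = 91.19, 120.0  # GeV

def e_z(roots, m_recoil):
    """CM-frame energy of a Z recoiling against a particle of mass m_recoil."""
    s = roots ** 2
    return (s + M_Z ** 2 - m_recoil ** 2) / (2.0 * roots)

e_z_zh = e_z(230.0, M_H)   # ZH at sqrt(s) = 230 GeV: about 101.8 GeV
e_z_zz = e_z(230.0, M_Z)   # ZZ: recoil is another Z, so exactly sqrt(s)/2 = 115 GeV
```

The ZH value of roughly 101.8 GeV is consistent with the E_Z window of 100-103 GeV used in the selection above.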
The next plot shows the recoil mass. The different cuts are in the plot (problems using math symbols here).
Monday, July 28, 2008
Some plots
What I've been doing is identifying the electrons and then forcing everything else into 2 jets. This means I'd have to do my own jet finding and flavour tagging if using the mass reconstruction files in the future.
Variables used in the likelihood cut
The thrust variables don't seem to do much, and the jet energy difference doesn't offer great discrimination. We still need to add the di-jet mass after the 5-constraint fit to this.
Flavour tag likeness for the Higgs sample (using the c-tag)
These plots look fairly good; there is a definite difference between each plot that should be extractable from the branching ratio fit.
Flavour tag likeness for the ZZ sample (both c-tag and bc-tag)
Same here for the ZZ. I've also added the same plot but using the c-with-only-b-background tag, although this doesn't show much (see the next set of plots). There also doesn't seem to be much tagged as c in the c-tag version. I'll have to look into that.
Flavour tag likeness for the Higgs sample (using the bc-tag)
The bc-tag just seems to tag anything that's not a b as a c (not really very surprising since that's what it's trained to do). All the templates (except the b) look pretty much the same, so not very good for fitting to.
Electron ID comparison
The next four plots compare the different electron identification methods. Everything is using a Z Higgs sample with the Z going to two electrons and the Higgs doing whatever it fancies, so there is some neutrino stuff in there as well from e.g. Higgs->WW. There's a total of 43888 events split according to standard model 120GeV Higgs branching ratios.
Note that anything labelled as "Pandora" is using a head version of PandoraPFA from a couple of weeks ago. A new tag has recently been released which claims to have improved electron ID, I'll try and redo the plots with the new version soon.
Di-jet mass of everything not identified as an electron forced to 2 jets
There are no cuts on any of these plots, so some of the high tails should be reduced if I cut out anything where less than 2 electrons are found (see below).
Di-electron mass of everything identified as an electron
I've called this "di-electron" mass but quite often there are more than 2 electrons involved, see the next plot for the numbers. Note that I forgot to put a cut on the number found, so the spikes at zero are when 1 or no electrons are identified.
Numbers of electrons found
The cheated electron code takes the Monte Carlo Z daughters and uses the LCRelation collection to match them to reconstructed particles. If more than one reconstructed particle is related to a particular electron then they are all lumped into one "jet" and the combined four momentum used for the cheated electron. As such I have no idea how it can find more than 2 (let alone 6?). I'll have to look into it when I get time. Addendum 06/Aug/08 - The Higgs sample has Higgs to ZZ, hence the possibility of 4 or 6 electrons from Zs. Doh!
Recoil mass using the electrons found
For this the initial four-momentum was assumed to be a constant (250,0,0,0); so you can clearly see the beam effects screwing up the plot. Have a look at one of my earlier posts to see how I simulated beamstrahlung; basically it's just a rough (over)estimate using 500 GeV beam parameters.
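For reference, a minimal sketch of the recoil-mass calculation against that assumed (250, 0, 0, 0) initial state; four-vectors are (E, px, py, pz) tuples, and beamstrahlung/ISR shift the true initial state, which is what smears the plot:

```python
import math

def recoil_mass(initial, visible):
    """Invariant mass of (initial - visible); four-vectors are (E, px, py, pz)."""
    e  = initial[0] - visible[0]
    px = initial[1] - visible[1]
    py = initial[2] - visible[2]
    pz = initial[3] - visible[3]
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(m2) if m2 > 0.0 else 0.0

# Assumed e+e- initial state with no beamstrahlung or ISR
P_INITIAL = (250.0, 0.0, 0.0, 0.0)
```

In the analysis, `visible` would be the summed four-momentum of the identified electron pair.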
Electron cut efficiency
The change due to the new detector model is seen in the ratio of the Ecal energy to the total energy, so we have changed our electron cuts. They are now:
1) Ecal/ETot = 0.9
2) ETot/p = 0.7
So with the new detector model, the above cuts and more statistics, the efficiency versus momentum and cos theta is here
Monday, July 21, 2008
Useful LCIO commands
Some time ago, Mark had problems with a large LCIO file. Now I have the same problem. I found on the web the command
lcio split -i file.slcio -n 1000
that splits the file in chunks of 1000 events.
For more LCIO commands, see here.
Monday, July 14, 2008
z-higgs preliminary results (2)
Efficiencies:
- Ecm = 230 GeV: 17958 Z boson candidates reconstructed out of 20000 generated. (90%)
- Ecm = 250 GeV: 18191 Z boson candidates reconstructed out of 20000 generated. (91%)
z-higgs preliminary results
Here are some plots produced using the muon selection from Hajrah to tag the muons in the PandoraPFOs collection.
Additional cuts: momentum of the lepton from Z greater than 20 GeV; mass of the Z boson between 70 and 110 GeV.
Recoil mass reconstructed using the nominal centre of mass energy.
The black and the red lines on the plots correspond to Ecm = 230 GeV and Ecm = 250 GeV, respectively. The samples used are described here.
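As a sketch, the selection above is a pair loop over muon candidates; the four-vector format (E, px, py, pz) and function names are illustrative:

```python
import itertools
import math

def invariant_mass(p1, p2):
    """Invariant mass of the summed pair of (E, px, py, pz) four-vectors."""
    e = p1[0] + p2[0]
    px, py, pz = (p1[i] + p2[i] for i in range(1, 4))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))

def momentum(p):
    return math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2)

def z_candidates(muons, p_min=20.0, m_lo=70.0, m_hi=110.0):
    """Muon pairs passing the lepton-momentum and Z-mass-window cuts."""
    pairs = []
    for m1, m2 in itertools.combinations(muons, 2):
        if momentum(m1) < p_min or momentum(m2) < p_min:
            continue
        mass = invariant_mass(m1, m2)
        if m_lo <= mass <= m_hi:
            pairs.append((m1, m2, mass))
    return pairs
```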
Efficiency Plots with more statistics
After applying the Muon cuts, the efficiency plots for Momentum and Costheta are shown below.
Please follow the link here:
https://www.wiki.ed.ac.uk/display/ILC/Hajrah+Tabassam
Edinburgh z-higgs samples
Edinburgh Z-higgs samples, e+e- -> Z h
Z -> mu+ mu-
h -> anything
The samples are located on the GRID at
/grid/ilc/users/edinburgh/data/z-higgs
Up to now there are 2 samples,
- Ecm = 230 GeV,
- Ecm = 250 GeV,
with 20000 events each. Both reconstructed (still being uploaded) and DST files are available.
More details can be found here.
Friday, July 11, 2008
Sample grid locations
I've copied most of my samples to the grid, and started a script running to copy the rest. Everything is under the directory:
/grid/ilc/users/phmag/LCIOOutput/ZHiggsAnalysis
With currently three directories:
WW-anything
ZH-eeXX
ZZ-eeJetJet
The files should be self-descriptive from the filenames. I was going to write how many events etc. there are, but I've rather stupidly left my log book at home and I'm away from tonight. There are only 10000 WW events (in 2 files); other than that everything should be whatever the numbers are for 500 fb^-1. Higgs to bbbar is split over 4 files, and the ZZ stuff is split over 100 files.
The detector is LDC01_05Sc at 250GeV, but with SEcal03 instead of SEcal02 (there was a problem with SEcal02, I can't remember the details). Beam effects were simulated by forcing the centre of mass energy in Pythia to match the energy spectrum from PandoraPythia (with the 500GeV beam parameters). That's about all I can remember for now, I'll check my log book when I get back.
Monday, July 7, 2008
Installing OpenScientist
OpenScientist is a full C++ implementation of AIDA. I was having big problems using Java on 64 bit machines reliably so switched to this instead of AIDAJNI.
To install, first download from openscientist.lal.in2p3.fr. I went for the "batch" version, which is basically the simple version without any GUI stuff for batch processing.
The installation instructions weren't particularly to the point so I'll recap what I did briefly (obviously change the version number if appropriate):
unzip osc_batch_source_v16r4.zip
cd OpenScientist/v16r4/osc_batch/v16r4/obuild
source setup.sh
./sh/build
./sh/build_release
I think that builds everything (there is more to OpenScientist than just an AIDA implementation), but it's easier than figuring out which bits you actually need.
To use it with the lcio software you need a CMake module. I wrote one with all the library and include directories hard coded into it; it would be nice if I wrote a proper self-detecting one but I haven't had the time. You can find out the compiler flags for AIDA with the utility "<location>/OpenScientist/v16r4/osc_batch/v16r4/bin_obuild/osc_batch/v16r4/bin/aida-config".
My module is here, obviously you'll need to change the directories to match your system. I called the module FindBatchLab.cmake, so then all you have to do is put that module in with the other CMake modules and change the dependencies in your projects CMakeLists.txt file from "AIDAJNI" to "BatchLab".
IMPORTANT NOTE FOR JAS USERS - BatchLab puts empty histograms/clouds etcetera into the AIDA file, which JAS doesn't like (e.g. if you create a histogram and didn't fill it with anything). I modified the file "OpenScientist/v16r4/BatchLab/v1r2p1/source/XML/XML_DataWriter.cxx" to this. Copy my file over the original before building if you want to use JAS to analyse your AIDA files. The xml schema says there should be at least one entry for clouds/histograms/etc, so I think the JAS way is the correct way.
Tuesday, July 1, 2008
Minutes of Meeting 30/6/08
Present: Hajrah, Roberval, Victoria, Mark, Joel (Clare is still in a muddy field)
- Mark's samples for 500 fb-1 of ZH at 250 GeV (mh = 120 GeV, Z->ee) have finished running through simulation and he is checking them. He used LDC01_05sc. The samples will be made accessible on the Grid and a post made to this blog. Z->µµ samples can also be made.
- Clare has been working on getting a working suite of software on the Bristol 64-bit machine (Roberval has successfully compiled the new version of MARLIN on the Edinburgh machine in 32-bit mode)
- Roberval has been writing a processor to use the muon ID cuts to tag RPOs
- Hajrah is looking at electron and muon cut ID efficiency as a function of θ and momentum, but needs bigger samples
- Victoria only has a few more weeks before her leave commences
- Mark has been encountering problems reading in large (~10 GByte) LCIO files. No-one knows of any physical limit, but he will split the files up in future.
Thursday, May 15, 2008
Monday, April 21, 2008
Muons from Z (erratum)
In my sample there were 4000 events, so there should be 8000 muons. But in the previous plot there were 8270 muons. Checking my code and the sample, I found that the "extra" muons were daughters of Zs coming from the Higgs decay chain, e.g. H -> ZZ*. The correct plot is below.
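One way to classify the extra muons is to walk each muon's Monte Carlo ancestry and check whether its parent Z itself descends from a Higgs. A generic sketch; the MCParticle class here is a minimal stand-in for a real MC truth record, not the actual LCIO one:

```python
PDG_Z, PDG_HIGGS, PDG_MU = 23, 25, 13

class MCParticle:
    """Minimal stand-in for an MC truth record with parent links."""
    def __init__(self, pdg, parents=()):
        self.pdg = pdg
        self.parents = list(parents)

def has_higgs_ancestor(particle):
    """True if any ancestor (transitively) is a Higgs."""
    stack = list(particle.parents)
    while stack:
        p = stack.pop()
        if p.pdg == PDG_HIGGS:
            return True
        stack.extend(p.parents)
    return False

def is_muon_from_prompt_z(muon):
    """Muon whose parent Z does NOT come from the Higgs decay chain."""
    return (abs(muon.pdg) == PDG_MU
            and any(p.pdg == PDG_Z and not has_higgs_ancestor(p)
                    for p in muon.parents))
```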
Muons from Z
I generated a sample of Z-Higgs events, at 250 GeV. The plot shows the distribution of the momentum of the muons coming from the Z.
Tuesday, April 8, 2008
Z Higgs talk at ILD meeting
There was an interesting talk on the Z Higgs channel at last week's ILD detector optimisation meeting
z higgs talk
Wednesday, March 12, 2008
JAIDA & AIDAJNI on 64-bit
I wrote some documentation on how I installed JAIDA (here) and AIDAJNI (here) in our cluster of 64-bit machines.
Another thing is: do not forget to include AIDAJNI on the list in the CMakeLists.txt:
SET( ${PROJECT_NAME}_DEPENDS "Marlin LCIO GEAR AIDAJNI ROOT" )
If you did not do that, you must go to the build directory, run make clean, delete the file CMakeCache.txt, re-run cmake, and do make install.
Note: Changing only in the ilcinstall python script, assuming you are using ilcinstall, to link to AIDAJNI does not work.
Monday, March 10, 2008
Plots for the meeting
I've been having a look at event selection; the Thorsten-Kuhl paper starts off with a cut based selection and then uses a likelihood selection. For the cut based selection they used:
- Two leptons found (electron identification is performed before this). At the moment I'm cheating the electron identification from Monte Carlo because the electron ID was getting messy due to the extra reconstructed neutrals.
- The invariant mass of the two leptons has to be between 70 GeV and 110 GeV.
- Cos(theta) of the event's primary thrust axis less than 0.9.
- The invariant mass of the two hadronic jets has to be greater than 100 GeV.
For the likelihood selection I've been looking at the following inputs:
- Cos(theta) of the event's primary thrust axis.
- The event's thrust.
- The invariant mass of the two hadronic jets.
- The energy difference of the two hadronic jets.
This plot shows the efficiency purity for the two methods (efficiency on x-axis). Note that because I've not got particularly large samples this plot was made with the same events as the reference distributions. A bit shoddy but if I can't get it to work like that then I may as well give up now. There are 6375 ZZ events in the background, and 5864 signal (slightly more c cbar than b bbar).
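For reference, a toy version of the likelihood combination: per-variable signal and background reference densities multiplied together, with L_S/(L_S+L_B) used as the discriminant. The variables and PDFs here are placeholders, not the actual reference distributions:

```python
def likelihood_ratio(values, signal_pdfs, background_pdfs):
    """Combined L_S / (L_S + L_B) for independent input variables.

    values: one value per input variable (e.g. thrust, jet mass, ...).
    signal_pdfs / background_pdfs: per-variable callables returning the
    normalised reference-distribution density at that value.
    """
    l_sig, l_bkg = 1.0, 1.0
    for v, s_pdf, b_pdf in zip(values, signal_pdfs, background_pdfs):
        l_sig *= s_pdf(v)
        l_bkg *= b_pdf(v)
    denom = l_sig + l_bkg
    return l_sig / denom if denom > 0.0 else 0.0
```

Cutting on the returned value at different thresholds then traces out the efficiency-purity curve.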
Tuesday, February 26, 2008
LCFIAIDAPlotProcessor crash
Running Marlin with the LCFIAIDAPlotProcessor from LCFIVertex software release v00-02-02 will crash, because the variable ptmasscorrection should have been changed to ptcorrectedmass. I corrected this in the HEAD version, but if you don't want to pick up any other modifications you can download LCFIAIDAPlotProcessor.cc here and recompile LCFIVertex.
Convert stdhep into HEPEvt
Here are instructions and codes to convert an .stdhep file into a .HEPEvt. Two files are needed:
The first file is the FORTRAN code used for the conversion. It outputs a file called output.HEPEvt. The second file must contain the name of your input stdhep file. To compile the code you need the stdhep include directory and the libraries libstdhep.a and libFmcfio.a. You can find them as part of the old cernlib versions, like 2002. You then compile with the commands (change the location according to your setup):
g77 -I$CERN/2002/include/stdhep -fno-second-underscore -c std2evt.F
g77 -fno-second-underscore -o std2evt std2evt.o $CERN/2002/lib/libstdhep.a $CERN/2002/lib/libFmcfio.a -lnsl
I also made a script with those commands. After compiling, an executable called std2evt is created. Modify the file stdhep_file_name with the name of your stdhep file and run:
./std2evt
Some output may be redirected to the screen, but don't worry. The converted output file is called output.HEPEvt.
Monday, February 25, 2008
Just for fun.
A friend sent me this link, and us geeks may appreciate it: http://stephenhicks.org/images/UniverseScale.gif
Minutes of meeting on 25th February
Present: Joel, Clare, Ryan, Hajrah, Roberval, Victoria.
Hajrah presented her plots (linked from the blog) showing the difference at the digitisation level between 55 GeV muons and 55 GeV pions. The energy of the muons probably doesn't make any sense. It uses the Muon digitiser from the head version of Marlin. Joel commented that it would be good to look at E_EM+E_HAD. Hajrah has not looked yet at tracking and clustering information, for this it would be best to use PandoraPFA, which is currently not working. Clare said that, after contact with Mark Thompson, she is using the head of PandoraPFA, which seems to work. Roberval will try to install that in Edinburgh, for Hajrah to use.
Clare had been installing the ILC software on a local 64bit Bristol machine. (The Grid is being abandoned!) Roberval has already done this in Edinburgh and Roberval and Clare will communicate as how to do this. Roberval noted that Marlin/Mokka run about 3 times faster on the 64bit machine in Edinburgh. We will encourage Roberval and Clare to co-author an ILC note on this subject.
Monday, February 18, 2008
New analysis timeline
With the funding cuts in the UK and USA (sob) there is a new timeline for the physics studies. Instead of August, things are delayed by ~6 months, so I think the idea is to have a study ready for publication by the end of 2008 or beginning of 2009.
http://www.linearcollider.org/cms/?pid=1000498
Saturday, February 16, 2008
ECal bugs in LDC01_05Sc and LDCPrime_01Sc
Just a note for those not on the ild-detector-optimisation mailing list.
There's some kind of problem with the electromagnetic calorimeter end cap hits in the Mokka models LDC01_05Sc and LDCPrime_01Sc. I can't say I understand it; I think it's something to do with the cells not scaling properly. Anyway, there's a different model being worked on now and it was decided not to bring out an interim fixed model. Paulo sent around instructions for fixing it yourself though:
Dear friends,
I guess that we decided to not touch the LDC01_05Sc and LDCPrime_01Sc models and to not create intermediate models. As the final model risks to take a while ;-) , here you are how to use LDC01_05Sc or LDCPrime_01Sc with the new Ecal (without the known bugs you have with the old one):
1) you have to checkout Mokka tag mokka-06-06-pre01 directly from our CVS HEAD (please follow the instructions here:
http://polzope.in2p3.fr/MOKKA/download)
2) , just add these two lines in your steering file:
/Mokka/init/EditGeometry/rmSubDetector SEcal02
/Mokka/init/EditGeometry/addSubDetector SEcal03 90
Cheers,
Paulo.
Thursday, February 14, 2008
Minutes of Meeting 11/2/08
Bristol: Clare, Ryan, Helen, Mark, Joel
Edinburgh: Hajrah, Roberval, Victoria
Mark showed some plots that will form part of his talk in the general physics meeting tomorrow (see previous post). They show the y-cut that is required to ensure that all of the electron EM clusters are grouped together into a single jet. This is shown both for "undecayed" (i.e. no bremsstrahlung within the tracking volume) and for all electrons. The y-cut value decreases with energy as expected, and is orders of magnitude smaller than values expected for hadron jets.
The final plot shows the distribution of electron track pT (green), total jet energy (orange) and single cluster energy (pink).
Hajrah has made an impressive start with Mokka/Marlin on muon ID. She is looking at the energy deposits and number of hits as muons traverse the calorimeters and will proceed to investigate the muon chambers.
The next meeting will be in two weeks, and they will remain fortnightly until further notice.
Monday, February 11, 2008
Muon id Presentation/ 11/02/08
My presentation is on this web page.
http://www.ph.ed.ac.uk/~vjm/ILC/Presentation.pdf
Minutes of meeting on 28th January
Sorry these are late!
In attendance: Victoria, Roberval, Hajrah, Joel, Clare, Ryan and Mark.
Mark is updating the framework for his analysis to make tagging variables and likelihood cuts (to separate data and background) a la Kuhl-Desch analysis. The likelihood cuts are based on thrust, visible energy and tagging probability.
Clare has generated ZZ --> l+ l- q qbar events with Pythia to make stdhep files. She's also been working on the Pandora calibration for the LDC_01-05 geometry.
Together Clare and Mark have been generating electron and jet samples for the UG student. UG student has been AWOL, so Helen will help out here.
Clare and Mark agreed that they would read all the LDC emails that are being circulated to check that we are conforming to the official analysis for the LCF CDR.
Roberval has been making a comparison of the results obtained with SimpleCaloDigi and MokkaCaloDigi. MokkaCaloDigi is better; Clare and Roberval will communicate on this.
Plan for upcoming work:
Hajrah is starting to learn about Marlin and Mokka, and will start generating single muon events this week.
Mark will re-generate the signal at ECM=250 GeV
Thursday, February 7, 2008
CoM Energy
I just received an email from Ron Settles on the ILD optimisation list that included this:
---------------
--The optimum c.m.energy for doing Higgs measurements is
\sqrt{s}=m_Z+m_Higgs+ca.20GeV as has been pointed out many times recently
(and even during Lep days); I think most people have now changed to this
(the Snowmass05 energy was not optimum for m_Higgs=120GeV); it would be
best if all use the optimum energy to make comparisons/combinations
easier.
-----------------
I'm not sure what "ca" means; presumably "circa", i.e. approximately 20 GeV. For m_Higgs = 120 GeV that gives roughly 91 + 120 + 20 = 231 GeV. Does this agree with our CoM energy of 250?
Tuesday, February 5, 2008
Python Script for Pandora Calibration
I have written a python script to run through the Pandora Calibration:
It will run over all of the calibration constants, produce the root files containing the histograms, fit the histograms and find the calibration constant that gives the mean of the fit closest to the correct value.
I've tried to find a sensible way for the code to decide which calibration constants to try next, and to decide when to finish the calibration, but I wouldn't want to guarantee that it gives the best answers.
The middle part of the code creates a Marlin steering file with the chosen calibration constants, and the appropriate slcio input files, runs this through Marlin to produce the histograms, gets the appropriate histogram, fits it, and prints the mean of the fit.
I think this part is useful in itself, even if you are not convinced by the way I have chosen the calibration constants each time.
You could always write a different loop around this bit, that just iterates through a load of calibration constants, maybe.
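That "different loop" could be as simple as a bisection on the fitted mean versus the target energy; a toy sketch of the idea, with the make-steering-file/run-Marlin/fit-histogram step replaced by a stand-in callable (all names here are illustrative, not the actual script's):

```python
def calibrate(fitted_mean, target, lo, hi, tol=0.01, max_iter=50):
    """Bisect a calibration constant until fitted_mean(c) hits target.

    fitted_mean: stand-in for 'write the steering file, run Marlin,
    fit the histogram, return the fitted mean', assumed monotonically
    increasing in the constant over [lo, hi].
    """
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        mean = fitted_mean(mid)
        if abs(mean - target) < tol:
            return mid
        if mean < target:
            lo = mid  # response too low: raise the constant
        else:
            hi = mid  # response too high: lower the constant
    return 0.5 * (lo + hi)
```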
Anyway, to run the code, first you need to set up pyroot:
export LD_LIBRARY_PATH=$ROOTSYS/lib:$PYTHONDIR/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$ROOTSYS/lib:$PYTHONPATH
This worked with our pretty standard root installation, but if it doesn't there is more info here
The python code is here
It calls a bash script that makes the Marlin steering file.
To run this script use:
python DoPandoraCalib.py
You will need to change the "slciofile" string to your slcio files on lines 100, 110 etc.
You will also need to change the gear file in MakeSteeringFile.sh.
"debug" can be changed:
if debug = 0, only the best calibration constants that the code finds will be printed out.
if debug = 1, some info about the calib consts that are being tried will also be printed out.
if debug = 2, lots of info will be printed, and the histograms will be drawn.
The starting values and iteration values (lines 95 and 96 etc.) for the calib consts will need to be set to something sensible; I've set them to what I think is sensible for LDC01_05Sc.
The script puts all the root files in a directory called 'CalibRootFiles'.
The calibration constants found by the code for LDC01_05Sc are as follows:
CalibrECAL = 63.4, 126.9
CalibrHCAL = 41.25
ECALMIPcalibration = 147.8
HCALMIPcalibration = 34.2
ECALEMMIPToGeV = 0.00675
ECALHadMIPToGeV = 0.00675
HCALEMMIPToGeV = 0.035
HCALHadMIPToGeV = 0.035
Mark Thompson's default consts are:
CalibrECAL = 62.5 123.0
CalibrHCAL = 31.2
ECALMIPcalibration = 171
HCALMIPcalibration = 37.1
ECALEMMIPToGeV = 0.00593
ECALHadMIPToGeV = 0.00593
HCALEMMIPToGeV = 0.026
HCALHadMIPToGeV = 0.026
Monday, January 28, 2008
Calorimeter calibration
Hajrah and Alex, from Edinburgh, found in their analyses that the Higgs mass distribution is quite high and wide when the mass is reconstructed from the b jets. That led us to wonder whether there is a problem with the calibration. As an exercise I ran the PandoraPFACalibrator processor, comparing the SimpleCaloDigi and MokkaCaloDigi processors with the detector model LDC01_05Sc. For the ECAL calibration, the energy distribution of 10 GeV photons is fine for both processors. But for 10 GeV KL with SimpleCaloDigi, the energy distribution is wide and peaks at an energy higher than 10 GeV.
Here are the plots for the energy distribution of the 10 GeV photons and the 10 GeV KL.
The solid black line comes from MokkaCaloDigi; dashed red line comes from SimpleCaloDigi.
We are using SimpleCaloDigi in our reconstruction, which might be the reason for the wrong mass distribution of the Higgs candidates. The next step is to check the calibration constants for LDC01Sc, as this is the model used for our samples, change to MokkaCaloDigi, and see if the results improve.
Tuesday, January 15, 2008
Minutes of Meeting 15/1/08
Our first regular meeting was held at 4pm on Monday. The main points were:
- Mark's Plots (see previous entry): number of jets and mass of the Z as a function of KT cut. Things look sensible in the limits, but he has discovered a bug that can double count if more than one MC particle is in the same jet (this gives the peaks at mZ=180 GeV etc.)
- Energy: it was agreed to make future samples with centre-of-mass energy 250 GeV as specified in the benchmarks
- Effort: Mark is working full time, with support from Clare. Victoria and Robervaal are working on other things at the moment but should start ramping up soon. Hajrah is coming up to speed, and will start looking at muon ID. There is also an undergraduate at Bristol who will look at electron ID, and an undergraduate at Edinburgh who will be asked to make a short presentation soon.
- Next Steps: Mark's priority will be to assemble all of the machinery for a crude analysis, so we can see where the work needs to be done in tuning etc.
- Future Meetings: We will switch to 1pm on Mondays in future. Next meeting is next week.
Monday, January 14, 2008
Plots for the meeting.
Here are some plots I'm going to talk about in today's meeting. I'll write something about them in the blog later, but the meeting starts in 5 minutes so I don't have time now.
Things as a function of kT up to 0.01 and up to 1.5. Note that the second set of plots has a double counting error if the positron and the electron are in the same jet (so the Z mass plot is double what it should be), I'm running up a corrected plot but it's not finished yet.
Addendum 15/Jan/08
Here's the corrected plot for the kT values up to 1.5, and also one for kT values up to 5*10^-4.
The reconstructed Z mass peak is still very low; I guess it will peak when every particle is in its own jet since the extra reconstructed neutrals are double counting energy (the charged particle gets its energy from the track momentum, so any counting of its clusters is double counting). You can sort of see this because the plot seems to spread higher more so than lower, although you can only see it for the first kT bin; I'm having "issues" getting Paida to rotate the plots appropriately.
I'll reverse the plots, i.e plot the kT as a function of the other stuff which should give me a decent kT cut to work with. I'll try and get some rudimentary electron selection done by the end of the week and then move on to the rest of the analysis framework, leaving someone else to worry about the details.
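For reference, the quantity being scanned above is, I assume, the standard e+e- Durham measure y_ij = 2 min(E_i^2, E_j^2)(1 - cos theta_ij) / E_vis^2; a two-particle sketch:

```python
import math

def durham_y(p1, p2, e_vis):
    """Durham y_ij for two (E, px, py, pz) four-vectors.

    e_vis is the total visible energy of the event in GeV.
    """
    e1, e2 = p1[0], p2[0]
    dot = sum(p1[i] * p2[i] for i in range(1, 4))
    mag1 = math.sqrt(sum(p1[i] ** 2 for i in range(1, 4)))
    mag2 = math.sqrt(sum(p2[i] ** 2 for i in range(1, 4)))
    cos_theta = dot / (mag1 * mag2)
    return 2.0 * min(e1, e2) ** 2 * (1.0 - cos_theta) / e_vis ** 2
```

Clustering repeatedly merges the pair with the smallest y_ij until all remaining pairs exceed the chosen y-cut, which is why the jet count falls as the cut rises.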
Wednesday, January 2, 2008
PandoraPFA Calibration
I had a go at the Pandora calibration, it's well documented and seems straight forward enough. I've put a script for a Grid job in the subversion repository to create all the samples necessary so it can be repeated if the detector changes (https://svn.phy.bris.ac.uk/svn/ilc_z_higgs/MarksStuff/PandoraCalibrationEventGeneration).
The files I generated for LDC00SC are in "/grid/ilc/users/phmag/LCIOOutput/PandoraCalibration".
I've not done any fitting with Root before, so for a first pass I just centred the histograms by eye.
The constants I came up with are:
MokkaCaloDigi:
CalibrECAL=27, 81
CalibrHCAL=27.3
PandoraPFA:
ECALMIPcalibration=230
HCALMIPcalibration=31.5
ECALEMMIPToGeV=0.0045
ECALHadMIPToGeV=0.0045
HCALEMMIPToGeV=0.0353
HCALHadMIPToGeV=0.0353
I'll re-run the reconstruction of all the Z Higgs samples with these constants tomorrow.