IRTGJets1: Difference between revisions

From KIP Wiki
⧼kip-jumptonavigation⧽⧼kip-jumptosearch⧽
m (New page: = Jet Reconstruction Tutorial = In the following jets will be reconstructed from clusters of energy depositions in a calorimeter, using three different jet algorithms: * Kt * AntiKt * Cam...)
 
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
= Jet Reconstruction Tutorial =
= Jet Reconstruction =


In the following jets will be reconstructed from clusters of energy depositions in a calorimeter, using three different jet algorithms:
In the following jets will be reconstructed from clusters of energy depositions in a calorimeter, using three different jet algorithms:
Line 9: Line 9:
== Motivation ==
== Motivation ==


As quarks and gluons can not be directly observed due to color confinement, algorithms have to be defined to construct physics objects (jets) ideally having the four-vector of the gluon or quark.
As quarks and gluons can not be directly observed due to color confinement, algorithms have to be defined to construct physics objects (jets) from three possible types of input:
These jet algorithms can be applied on different levels of input:
* Parton Level - using the final state particles of the hard interaction
* Parton Level - using the final state particles of the hard interaction
* Particle Level - using the final state particles after hadronization of any quarks and gluons
* Particle Level - using the final state particles after hadronization of any quarks and gluons
* Calorimeter Level - using energy depositions in a calorimeter, either cells or clusters of cells
* Calorimeter Level - using energy depositions in a calorimeter, either cells or clusters of cells
Ideally a jet would have the same properties on all three levels, independent of inputs.
Ideally a jet would have the same properties on all three levels.
The, at first look, most intuitive jet algorithm is to sum up objects (energy depositions, particles, ...) inside a cone of fixed size around the axis of the jet. But the resulting Cone algorithms usually have a number of problems (infrared safety, collinear safety, seeds, ...) and require extra split and merge steps to result in sensible jets.
The at first look most intuitive jet algorithm is to sum up objects (energy depositions, particles, ...) inside a cone of fixed size around the axis of the jet. But the resulting Cone algorithms usually have a number of problems (infrared safety, collinear safety, seeds, ...) and require extra split and merge steps to result in sensible jets.


Thus an alternative will be used here, the so called kT algorithm which is more straightforward in implementation. In this tutorial two other algorithms can be obtained by changing one variable. The underlying principle is the following:
Thus the called AntikT clustering algorithm is used here. In this exercise two other algorithms can be obtained by changing one variable. The underlying principle is the following:


=== kT jet algorithm ===
=== Clustering algorithm ===
The algorithm starts with a list of vectors which e.g. can be a list of cells or clusters in the calorimeter, or a list of particles. These objects will be called preclusters in the following.
The algorithm starts with a list of vectors which e.g. can be a list of cells or clusters in the calorimeter, or a list of particles. These objects will be called preclusters in the following.
For each precluster ''i'' a variable ''d'' is calculated:
For each precluster ''i'' a variable ''d'' is calculated:
Line 35: Line 34:
* p = 0: Cambridge/Aachen
* p = 0: Cambridge/Aachen
* p = -1: AntiKt
* p = -1: AntiKt
Lets first have a look at ''p'' = -1, the AntiKt jet algorithm. Naively it starts with the precluster with the highest transverse momentum and checks if there is another cluster closer than ''D'' (using distance in the eta/phi plane). If there is one the two are merged, if not then the precluster is defined as a jet. This continues until all preclusters are parts of a jet.
Lets first have a look at ''p'' = -1, the AntiKt jet algorithm. It results in very circular high pT jets, with lower pT jets being crescent shaped if they are too close to a high pT jet.
This results in very circular high pT jets, with lower pT jets being crescent shaped if they are too close to a high pT jet.


== Acquiring the files and first look ==
== Acquiring the files and first look ==
Line 63: Line 61:
t->MakeClass("somename")
t->MakeClass("somename")
but this was already done and some extras were added to the resulting files for convenience.
but this was already done and some extras were added to the resulting files for convenience.
In jetrec.h the class jetrec is defined and jetrec.C contains the actual analysis code. Open it with an editor of your choice. The main code will be in the Loop() function. There are markers set to indicate where the analysis code will have to be inserted. Other functions are already defined, e.g. to merge clusters, and some histograms are defined and filled.
Have a quick glance at jetrec.h first:
* the class jetrec is defined
* variables and their addresses are set to access the tree holding the data; most variables are vectors, containing one value for each cluster in an event.
jetrec.C contains the actual analysis code. Open it with an editor of your choice. The main code will be in the Loop() function. There are markers set to indicate where the analysis code will have to be inserted. Other functions are already defined, e.g. to merge clusters, and some histograms are defined and filled.


== Jet Reconstruction ==
== Jet Reconstruction ==
Line 78: Line 73:
And start the code, executing everything in the Loop() function for every event:
And start the code, executing everything in the Loop() function for every event:
t.Loop()
t.Loop()
But the Loop() function currently does not do anything useful.
But the Loop() function currently does not do much, except for obtaining the input clusters and filling a simple distribution of the number of clusters per event. You can draw that distribution by doing:
As a first step simple plots probably already studied using the TBrowser can be added, e.g. the number of clusters per event:
h_nclusters->Fill(cl_nc_topoEM430);
Reload and rerun the code to look at the resulting histogram:
h_nclusters->Draw()
h_nclusters->Draw()

=== Creating Jets ===
=== Creating Jets ===
To move towards the reconstruction of jets, lets first try removing clusters from the precluster list and adding them to the list of jets. A loop over the list of clusters can be used:
To move towards the reconstruction of jets, lets first try removing clusters from the precluster list and adding them to the list of jets. A loop over the list of clusters can be used:
Line 98: Line 91:
t.Loop()
t.Loop()
h_jet_num->Draw()
h_jet_num->Draw()
While jets are built out of single clusters so far, this distribution differs from the number of clusters previously studied, as only objects with more than 5 GeV of transverse momentum are considered as jets.
While jets are built out of single clusters so far without any merging, this distribution differs from the number of clusters previously studied, as only objects with more than 5 GeV of transverse momentum are considered as jets.


=== Calculating d_i and d_ij===
=== Calculating d_i and d_ij===
Line 125: Line 118:
}
}
}
}
Now the minimum value of d_i is found, at position min_di_i, and stored in min_di. Similarly the minimum of the d_ij values is min_dij, with i and j stored as min_dij_i and min_dij_j.
The minimum value of d_i is found, at position min_di_i, and stored in min_di. Similarly the minimum of the d_ij values is min_dij, with i and j stored as min_dij_i and min_dij_j.
If min_dij is lower than min_di, two clusters have to be merged:
If min_dij is lower than min_di, two clusters have to be merged:
if (min_dij < min_di) {
if (min_dij < min_di) {
Line 157: Line 150:
D influences the size of the jets. A larger D will result in less, more energetic jets, while a smaller D will reconstruct more smaller jets.
D influences the size of the jets. A larger D will result in less, more energetic jets, while a smaller D will reconstruct more smaller jets.


Playing around with these variables, the histograms showing jet distributions can be used to compare results. Histograms can e.g. be saved as root files from the File menu of the histogram window, to compare your different results e.g. in a TBrowser. The option "same" can be set at the top right to draw histograms on top of already existing ones. With a right click on the line of a histogram, its colour can be set by SetLineAttributes (all this can of course also be done using the command line).
Playing around with these variables, the histograms showing jet distributions can be used to compare results.


== Author ==
== Author ==

Latest revision as of 09:29, 11 May 2010

Jet Reconstruction

In the following jets will be reconstructed from clusters of energy depositions in a calorimeter, using three different jet algorithms:

  • Kt
  • AntiKt
  • Cambridge/Aachen

Resulting jet distributions can be plotted to verify that the reconstruction is working, and to compare different jet algorithms.

Motivation

As quarks and gluons can not be directly observed due to color confinement, algorithms have to be defined to construct physics objects (jets) from three possible types of input:

  • Parton Level - using the final state particles of the hard interaction
  • Particle Level - using the final state particles after hadronization of any quarks and gluons
  • Calorimeter Level - using energy depositions in a calorimeter, either cells or clusters of cells

Ideally a jet would have the same properties on all three levels. The at first look most intuitive jet algorithm is to sum up objects (energy depositions, particles, ...) inside a cone of fixed size around the axis of the jet. But the resulting Cone algorithms usually have a number of problems (infrared safety, collinear safety, seeds, ...) and require extra split and merge steps to result in sensible jets.

Thus the called AntikT clustering algorithm is used here. In this exercise two other algorithms can be obtained by changing one variable. The underlying principle is the following:

Clustering algorithm

The algorithm starts with a list of vectors which e.g. can be a list of cells or clusters in the calorimeter, or a list of particles. These objects will be called preclusters in the following. For each precluster i a variable d is calculated:

d_i = pT_i^(p * 2)

where pT_i is the transverse momentum of that precluster and p is a parameter to define the algorithm. Furthermore for each pair of preclusters i and j (i!=j) a d is calculated:

d_ij = min( pT_i^(p * 2), pT_j^(p * 2) ) * ((eta_i - eta_j)² + (phi_i - phi_j)²) / D²

where D is a parameter influencing the size of the resulting jets. The minimum of all d_i and d_ij values is determined. If this is one of the d_ij values, the preclusters i and j are merged. If the minimum is one of the d_i values, the precluster is removed from the list and added to the list of jets. This is done until the list of preclusters is empty.

The parameter p provides the choice of the following jet algorithms:

  • p = 1: kT
  • p = 0: Cambridge/Aachen
  • p = -1: AntiKt

Lets first have a look at p = -1, the AntiKt jet algorithm. It results in very circular high pT jets, with lower pT jets being crescent shaped if they are too close to a high pT jet.

Acquiring the files and first look

The files required for this Tutorial can be downloaded from: [1]

After the download put the tar.gz into a work directory and extract the contents, e.g. like:

mkdir $HOME/jet_tut
mv jetrec.tar.gz $HOME/jet_tut/.
cd $HOME/jet_tut
tar xvzf jetrec.tar.gz

3 files (plus one backup) are contained in the archive:

cluster_sample.root

This is a flat root ntuple containing variables that describe 10000 events of simulated prompt photon production at an LHC experiment. The ntuple contains vectors with variables for all clusters of an event.

To have a quick look into the data structure and at first distributions, a TBrowser can be used:

root -l cluster_sample.root
new TBrowser

In the left hand pane, navigate to ROOT files, the file and then the tree called CollectionTree. To the right the leaves containing the data are now shown. A double click draws a simple histogram of one variable.

jetrec.C and jetrec.h

The jet reconstruction is done on code created using the root "MakeClass" command. You could create your own class by doing:

TTree *t = gDirectory->Get("CollectionTree")
t->MakeClass("somename")

but this was already done and some extras were added to the resulting files for convenience. In jetrec.h the class jetrec is defined and jetrec.C contains the actual analysis code. Open it with an editor of your choice. The main code will be in the Loop() function. There are markers set to indicate where the analysis code will have to be inserted. Other functions are already defined, e.g. to merge clusters, and some histograms are defined and filled.

Jet Reconstruction

First Steps

The code can be run out of the box. Start root:

root -l

Load the jetrec files:

.L jetrec.C

Create an instance of the jetrec class:

jetrec t

And start the code, executing everything in the Loop() function for every event:

t.Loop()

But the Loop() function currently does not do much, except for obtaining the input clusters and filling a simple distribution of the number of clusters per event. You can draw that distribution by doing:

h_nclusters->Draw()

Creating Jets

To move towards the reconstruction of jets, lets first try removing clusters from the precluster list and adding them to the list of jets. A loop over the list of clusters can be used:

while (cl_pt.size()>1) { // while the list has at least 2 entries
  // ------------------------------------------------------------------
  // insert further code directly above this line	
  add_cluster_as_jet(min_di_i); //add the cluster to the jet list and
  erase_cluster(min_di_i); //delete the cluster
}
add_cluster_as_jet(0); // loop stopped with one cluster left, add that, too

min_di_i is always zero here, thus pointing at the first cluster in the vector, but will be useful later. Likewise stopping the loop when one cluster is still left, and adding that one by extra code does not make any sense at the moment but will be necessary later. You can now rerun the code and draw the number of jets:

.L jetrec.C
jetrec t
t.Loop()
h_jet_num->Draw()

While jets are built out of single clusters so far without any merging, this distribution differs from the number of clusters previously studied, as only objects with more than 5 GeV of transverse momentum are considered as jets.

Calculating d_i and d_ij

The next step is to calculate the variables d_i for every cluster, and look for the minimum. Add the following code, at the top of the loop over all clusters (direcly below while ...):

min_di_i = 0; // reset to position 0 for this event
min_di = pow(cl_pt.at(0),para*2);
for (Int_t i = 0; i < cl_pt.size(); i++) { // a loop over all clusters
  di = pow(cl_pt.at(i),para*2); // calculate d
  if (di > min_di) continue; // if its not the smallest of the d's so far, go to the next one
  min_di = di; // if it is the minimum so far, safe the value
  min_di_i = i; // and position
}

Now the values d_ij for pairs of clusters are needed. To calculate them another loop over clusters is added directly below the last piece of code:

min_dij_i = 0; // reset again
min_dij_j = 1;
min_dij = min(pow(cl_pt.at(0),para*2), pow(cl_pt.at(1),para*2));
for (Int_t i = 0; i < cl_pt.size(); i++) { // loop over all clusters
  for (Int_t j = i+1; j < cl_pt.size(); j++) { // loop over all clusters where i!=j
     dij = min(pow(cl_pt.at(i),para*2), pow(cl_pt.at(j),para*2));  //calculate d_ij in three steps (to break the large formula down a bit)
     dij = dij * (pow(cl_eta.at(i)-cl_eta.at(j),2)+pow(cl_phi.at(i)-cl_phi.at(j),2));
     dij = dij / pow(D,2);
     if (dij > min_dij) continue; // if its not the smallest d_ij, go to the next pair
     min_dij = dij; // if it is the minimum, save the value and
     min_dij_i = i; // position i and
     min_dij_j = j; // position j
  }
}

The minimum value of d_i is found, at position min_di_i, and stored in min_di. Similarly the minimum of the d_ij values is min_dij, with i and j stored as min_dij_i and min_dij_j. If min_dij is lower than min_di, two clusters have to be merged:

if (min_dij < min_di) {
  // if one of the d_ij's is the minimum, merge the clusters into a new one, delete the old ones
  merge_clusters(min_dij_i, min_dij_j);
  //erase cluster i
  erase_cluster(min_dij_i);
  //erase cluster j
  erase_cluster(min_dij_j);
  continue;
}

This is done using functions in jetrec.C that were predefined for this tutorial. If the absolute minimum is a d_i value, we have a jet candidate and remove the cluster. This code was already put in earlier in the tutorial (and now makes more sense). It should have ended up directly below the last code snippet.

That's it, the jet algorithm is finished. Rerunning the code the following histograms can be studied using Draw():

  • h_jet_num : number of jets reconstructed
  • h_jet_pt : transverse momentum distribution of all jets
  • h_jet_eta : distribution of jet pseudorapidities
  • h_jet_phi : distribution of jet phi

Congratulations, you just reconstructed jets from a list of clusters of energy depositions in a calorimeter.

Further Tasks

At the very beginning of jetrec.C, two variables are defined:

Int_t para = -1;
Double_t D = 0.6;

para is the parameter p from the introduction, and chooses the jet algorithm to be used:

  • para = 1; // kT
  • para = 0; //Cambridge/Aachen
  • para = -1; //AntiKt

D influences the size of the jets. A larger D will result in less, more energetic jets, while a smaller D will reconstruct more smaller jets.

Playing around with these variables, the histograms showing jet distributions can be used to compare results.

Author

(c) Frederik Ruehr