What is an overlap criterion?

Halachev, 7 years ago

An overlap criterion defines when a region from a dataset "overlaps" with a particular annotation.

EpiExplorer supports three main overlap criteria:

  • Any overlap (at least 1bp)
  • Medium overlap (at least 10%)
  • Strong overlap (at least 50%)

Selecting the overlap criterion "Strong overlap (at least 50%)" means that a region from the dataset is considered "overlapping" with a particular annotation only if at least 50% of the region is covered by the annotation. The percentage is always measured relative to the length of the dataset region, not the length of the annotation.

Example: Suppose a dataset contains the region (chr1, 1000, 2000) and we compare it with an annotation located at (chr1, 1850, 2050). The part of the dataset region covered by the annotation is (chr1, 1850, 2000), which is 150bp. The overlap ratio of the dataset region for this annotation is therefore 150bp out of 1000bp, or 15%. If you select a cutoff of 10%, this region will be counted as overlapping; if you select a cutoff of 50%, it will not.
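The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration, not EpiExplorer's actual code: the function names and the criterion labels ("any", "medium", "strong") are made up here, and coordinates are assumed to be half-open so that (1000, 2000) is 1000bp long, matching the numbers above.

```python
# Illustrative sketch only; names are hypothetical, not EpiExplorer's API.
# Regions and annotations are (chrom, start, end) tuples, assumed half-open.

def overlap_bp(region, annotation):
    """Number of base pairs of `region` covered by `annotation`."""
    r_chrom, r_start, r_end = region
    a_chrom, a_start, a_end = annotation
    if r_chrom != a_chrom:
        return 0
    return max(0, min(r_end, a_end) - max(r_start, a_start))


def overlap_ratio(region, annotation):
    """Fraction of the dataset region (not of the annotation) that is covered."""
    _, start, end = region
    length = end - start
    if length <= 0:
        return 0.0
    return overlap_bp(region, annotation) / length


def passes(region, annotation, criterion):
    """Apply one of the three criteria; thresholds are taken as inclusive."""
    if criterion == "any":
        return overlap_bp(region, annotation) >= 1
    if criterion == "medium":
        return overlap_ratio(region, annotation) >= 0.10
    if criterion == "strong":
        return overlap_ratio(region, annotation) >= 0.50
    raise ValueError(f"unknown criterion: {criterion}")


region = ("chr1", 1000, 2000)
annotation = ("chr1", 1850, 2050)

print(overlap_bp(region, annotation))        # 150
print(passes(region, annotation, "medium"))  # True
print(passes(region, annotation, "strong"))  # False
```

Note that the ratio is asymmetric: the same 150bp covers 15% of the dataset region but 75% of the 200bp annotation, and only the former is used for the criterion.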

The choice of these three criteria may seem arbitrary, but the feedback we have received indicates that they cover the majority of scenarios. They are also fixed only for convenience. If needed, EpiExplorer lets you customize the criterion by visualizing the distribution of the overlap ratios and filtering on it. This way you can apply any cutoff you prefer.
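To make the idea of a custom cutoff concrete, here is a rough Python sketch of what "look at the distribution of overlap ratios, then filter" amounts to. In EpiExplorer this is done through the interface; the code below is not its implementation. The helper names, the sample regions, and the 25% cutoff are all invented for illustration, and coordinates are again assumed to be half-open.

```python
# Illustrative sketch only; names and data are hypothetical.
from collections import Counter


def overlap_ratio(region, annotation):
    """Fraction of `region` covered by `annotation`; both are (chrom, start, end)."""
    r_chrom, r_start, r_end = region
    a_chrom, a_start, a_end = annotation
    if r_chrom != a_chrom or r_end <= r_start:
        return 0.0
    covered = max(0, min(r_end, a_end) - max(r_start, a_start))
    return covered / (r_end - r_start)


def histogram(ratios, bins=10):
    """Count ratios per equal-width bin; a ratio of 1.0 falls in the last bin."""
    counts = Counter(min(int(x * bins), bins - 1) for x in ratios)
    return [counts.get(i, 0) for i in range(bins)]


def filter_by_cutoff(ratio_by_region, cutoff):
    """Keep the regions whose overlap ratio is at least `cutoff`."""
    return [r for r, ratio in ratio_by_region.items() if ratio >= cutoff]


annotation = ("chr1", 1850, 2050)
regions = [
    ("chr1", 1000, 2000),  # 150bp of 1000bp covered
    ("chr1", 1800, 2000),  # 150bp of 200bp covered
    ("chr1", 1900, 2000),  # fully covered
    ("chr1", 3000, 4000),  # no overlap
]

ratio_by_region = {r: overlap_ratio(r, annotation) for r in regions}

print(histogram(ratio_by_region.values()))
print(filter_by_cutoff(ratio_by_region, 0.25))  # custom 25% cutoff
```

With a 25% cutoff, the first region (15%) is excluded even though it passes the built-in "Medium overlap" criterion, which is the kind of in-between setting the custom filter is meant for.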