Description

This track contains merged transcript models obtained in the GENCODE Capture Long-Seq (CLS) project, phase 1.

These models were produced with compmerge, which merges aligned PacBio reads that have compatible intron/exon structures, using the anchored procedure. Under this procedure, a transcript whose end(s) are supported by CAGE or polyA data is not merged into a longer, compatible "container" transcript. As a result, all supported transcript ends are preserved in the output, including "internal" sites.
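The anchoring rule can be illustrated with a minimal sketch. This is not compmerge's implementation: the Model class, its fields, and the may_absorb function are hypothetical names, and the sketch assumes the two models have already been found structurally compatible. It only shows the guard that keeps a supported end from disappearing inside a longer container.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical transcript model, described by its genomic extent only."""
    start: int                     # leftmost genomic coordinate
    end: int                       # rightmost genomic coordinate
    start_supported: bool = False  # left end backed by CAGE or polyA evidence
    end_supported: bool = False    # right end backed by CAGE or polyA evidence


def may_absorb(container: Model, contained: Model) -> bool:
    """Return True if `contained` may be merged into the longer `container`.

    Assumption: intron/exon compatibility was checked beforehand.
    """
    # A supported end lying strictly inside the container would be lost on merging.
    if contained.start_supported and contained.start > container.start:
        return False
    if contained.end_supported and contained.end < container.end:
        return False
    return True
```

In this sketch, a shorter model with a supported internal start is kept as a separate output model, while an unsupported one can be absorbed by the container.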

All merged transcript models are derived from aligned PacBio reads with the following properties:

If spliced, all their introns must be canonical (GT/AG or GC/AG).
If monoexonic, they must bear a detectable polyA tail.
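The two read-level criteria above can be sketched as a single predicate. The function name and its input representation (a list of donor/acceptor dinucleotide pairs read on the transcript strand, plus a polyA flag) are assumptions made for illustration, not part of the CLS pipeline.

```python
CANONICAL_DONORS = {"GT", "GC"}
CANONICAL_ACCEPTOR = "AG"


def passes_read_filter(intron_motifs, has_polya_tail):
    """Decide whether an aligned read qualifies for merging.

    intron_motifs: list of (donor, acceptor) dinucleotides, one pair per
                   intron, on the transcript strand; empty for monoexonic reads.
    has_polya_tail: True if a polyA tail was detected on the read.
    """
    if intron_motifs:
        # Spliced read: every intron must be GT/AG or GC/AG.
        return all(
            donor.upper() in CANONICAL_DONORS and acceptor.upper() == CANONICAL_ACCEPTOR
            for donor, acceptor in intron_motifs
        )
    # Monoexonic read: a detectable polyA tail is required.
    return has_polya_tail
```

Note that in this sketch the polyA flag is only consulted for monoexonic reads, which mirrors the way the two criteria are stated.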

See Lagarde et al. for more details.

HiSeq Support

HiSeq-supported sets contain merged transcript models that are either monoexonic (in which case HiSeq support is not applicable) or spliced. For spliced models, the entire intron chain is supported by captured HiSeq data, in the form of at least one spliced HiSeq read with exactly the same coordinates and strand.
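One possible reading of this criterion is sketched below: every intron of a spliced model must match a splice junction observed in at least one HiSeq read, with identical coordinates and strand. That per-intron reading, the function name, and the tuple layout are assumptions for illustration only.

```python
def in_hiseq_supported_set(introns, hiseq_junctions):
    """Return True if a merged model belongs in the HiSeq-supported set.

    introns: list of (chrom, start, end, strand) tuples, empty if monoexonic.
    hiseq_junctions: set of (chrom, start, end, strand) tuples taken from
                     spliced HiSeq reads (hypothetical precomputed input).
    """
    if not introns:
        # Monoexonic models are included; HiSeq support does not apply to them.
        return True
    # Spliced models: the whole intron chain needs exact junction matches.
    return all(intron in hiseq_junctions for intron in introns)
```

Because the match is on exact tuples, an intron that differs from a HiSeq junction by a single base, or lies on the opposite strand, counts as unsupported in this sketch.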

Display Conventions and Configuration

Red: Transcript models with novel intron chains (with respect to GENCODE 20/M3) according to comptr.
Black: Transcript models with known intron chains (with respect to GENCODE 20/M3) according to comptr, or monoexonic.
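The color convention can be expressed as a small lookup. The sketch below stands in for the comptr classification with an exact-match test against a set of annotated intron chains; that substitution, the function name, and the data layout are assumptions, and comptr itself may classify chains differently.

```python
def display_color(intron_chain, known_chains):
    """Pick the display color for a merged transcript model.

    intron_chain: list of (chrom, start, end, strand) tuples, empty if monoexonic.
    known_chains: set of intron chains (each a tuple of intron tuples) taken
                  from the reference annotation (hypothetical input).
    """
    if not intron_chain:
        # Monoexonic models are drawn in black.
        return "black"
    # Known intron chain: black. Novel intron chain: red.
    return "black" if tuple(intron_chain) in known_chains else "red"
```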