This directory contains merged transcript models produced using the "anchored" merging procedure on PacBio HCGMs (High-Confidence Genome Mappings) within the GENCODE Capture Long-Seq project. The "compmerge" software (https://github.com/sdjebali/Compmerge) was used for the merging. All files correspond to genome assemblies hg38 (human) and mm10 (mouse).

# File naming scheme:

<species>All_Cap1_<tissue>_<merging_method>.compmerge.<subset>.gtf.gz

where:

  species:
    "mm": mouse
    "hs": human

  tissue:
    self-explanatory, except:
    "all": transcript models merged across all available tissues

  merging_method:
    "anchor": files produced with the "anchored" merging procedure (see Methods section of the paper)
    "noAnchor": files produced with the standard merging procedure (see Methods section of the paper)

  subset:
    "all": all merged transcript models, regardless of their end support
    "cageSupported": merged transcript models whose 5' end is supported by a FANTOM5 CAGE TSS
    "polyASupported": merged transcript models whose 3' end is supported by a captured polyA site (i.e., composed of poly-adenylated PacBio reads)
    "cage+polyASupported": full-length merged transcript models whose:
      - 5' end is supported by a FANTOM5 CAGE TSS, and
      - 3' end is supported by a captured polyA site (i.e., composed of poly-adenylated PacBio reads)

# File format:

GTF, with the internal merged transcript model identifier as the "transcript_id" value and the comma-separated list of contributing PacBio reads as the "gene_id" value.

Note that when the <tissue> value of the filename is "all", the "gene_id" attribute value is the list of contributing models merged *within* each tissue, and not the list of contributing PacBio reads.
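
# Example (sketch): reading file names and GTF attributes

The naming scheme and the attribute convention described above can be unpacked with a few lines of Python. This is only a minimal sketch, not part of the distributed data or of compmerge. It assumes the filename layout is <species>All_Cap1_<tissue>_<merging_method>.compmerge.<subset>.gtf.gz, and that GTF attributes follow the usual `key "value";` form. The function names and the file name and GTF line used for illustration are hypothetical.

```python
import re

# Assumed filename layout (an assumption of this sketch):
#   <species>All_Cap1_<tissue>_<merging_method>.compmerge.<subset>.gtf.gz
FILENAME_RE = re.compile(
    r"^(?P<species>hs|mm)All_Cap1_"
    r"(?P<tissue>.+)_(?P<merging_method>anchor|noAnchor)"
    r"\.compmerge\."
    r"(?P<subset>all|cageSupported|polyASupported|cage\+polyASupported)"
    r"\.gtf\.gz$"
)

# Matches GTF attributes written as: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_filename(name):
    """Split a file name into species, tissue, merging_method and subset."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match.groupdict()


def parse_gtf_line(line):
    """Return the merged model id and its list of contributors.

    The contributors come from the gene_id attribute: PacBio read ids,
    or per-tissue merged model ids when <tissue> is "all".
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF record must have 9 tab-separated fields")
    attrs = dict(ATTR_RE.findall(fields[8]))
    # Drop empty items in case the list ends with a trailing comma.
    contributors = [c for c in attrs["gene_id"].split(",") if c]
    return {"transcript_id": attrs["transcript_id"], "contributors": contributors}


if __name__ == "__main__":
    # Hypothetical file name, used only to illustrate the scheme.
    print(parse_filename("hsAll_Cap1_Brain_anchor.compmerge.all.gtf.gz"))
```

To process a whole file, each line read through `gzip.open(path, "rt")` can be passed to `parse_gtf_line`, skipping any line that starts with "#".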