Reference
ce_detector.detector
class for detecting junction reads
-
class
ce_detector.detector.JunctionDetector(bam_file, reference, quality=0, output=None)
class for detecting junction reads and recording their positions
- Parameters
bam_file (str) – filename of the BAM file
output (str) – filename of output
reference (str) – filename of genome reference
quality (int) – quality for filtering junction reads
-
static
check_strand(anchor, acceptor)
determine the strand of a read
- Parameters
anchor (str) – anchor of read
acceptor (str) – acceptor of read
- Returns
type of strand (-|+)
- Return type
str
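One common way to infer the strand of a spliced read is from the dinucleotides at the two ends of the intron, read on the forward reference strand: GT...AG indicates the plus strand and CT...AC (its reverse complement) the minus strand. The sketch below illustrates that idea only. The function name `infer_strand`, its parameter names and the `None` fallback are hypothetical, and whether `check_strand` uses exactly this rule is an assumption.

```python
# Canonical intron boundary motifs as seen on the forward reference strand.
PLUS_MOTIFS = {("GT", "AG")}
MINUS_MOTIFS = {("CT", "AC")}  # reverse complement of GT...AG


def infer_strand(left_motif, right_motif):
    """Hypothetical helper: return '+', '-' or None for a pair of intron-end motifs.

    left_motif / right_motif are the first and last two bases of the intron
    on the forward strand (assumed to correspond to anchor and acceptor).
    """
    pair = (left_motif.upper(), right_motif.upper())
    if pair in PLUS_MOTIFS:
        return "+"
    if pair in MINUS_MOTIFS:
        return "-"
    return None  # non-canonical motif: strand cannot be decided by this rule


plus_call = infer_strand("GT", "AG")
minus_call = infer_strand("ct", "ac")
```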
-
run(logger, verbose=False)
detect junction reads, annotate splice sites, and write the results to file
- Returns
JunctionMap instance
- Return type
instance
-
worker(bam_file, reference, chrom, quality, idn, junctionmap)
find junction reads and annotate splice sites
- Parameters
bam_file (instance) – handle of the BAM file
reference (instance) – handle of the genome reference
chrom (str) – chromosome
quality (int) – quality for filtering reads
idn (int) – identifier of reads
junctionmap (instance) – JunctionMap instance
- Returns
JunctionMap instance
- Return type
instance
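For orientation, a junction read is an alignment whose CIGAR string contains an `N` operation (a region skipped on the reference, as defined in the SAM specification). The sketch below shows how junction coordinates can be derived from a CIGAR string in plain Python. It is not the package's worker: the function name `junctions_from_cigar` and the 0-based, half-open coordinate convention are assumptions made for this example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference (SAM specification).
REF_CONSUMING = set("MDN=X")


def junctions_from_cigar(pos, cigar):
    """Hypothetical helper: list the (start, end) of every N gap in an alignment.

    pos is the 0-based leftmost reference position of the alignment;
    returned intervals are 0-based and half-open.
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((ref, ref + length))
        if op in REF_CONSUMING:
            ref += length
    return junctions


single = junctions_from_cigar(100, "50M200N50M")
double = junctions_from_cigar(0, "10S40M100N30M2I20M50N10M")
```

Soft clips (`S`) and insertions (`I`) do not move the reference position, which is why they leave the junction coordinates unchanged.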
-
class
ce_detector.detector.JunctionMap
class for storing information on all junction reads
-
class
ce_detector.detector.Read(chrom, start, end, idn, score, strand, anchor, acceptor)
class for storing information on a single junction read
- Parameters
chrom (str) – chromosome of genome
start (int) – start position of junction read
end (int) – end position of junction read
idn (int) – index of junction read
score (int) – support of junction read
strand (str) – direction of junction read (-|+)
anchor (str) – anchor of junction read
acceptor (str) – acceptor of junction read
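As an illustration of the record described by these parameters, the following stand-in mirrors the documented fields with a dataclass. It is not the package's `Read` class: the name `JunctionRead`, the strand check and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JunctionRead:
    """Hypothetical stand-in with the same fields as the documented Read."""
    chrom: str
    start: int
    end: int
    idn: int
    score: int
    strand: str
    anchor: str
    acceptor: str

    def __post_init__(self):
        # Assumed sanity check; the real class may not validate its input.
        if self.strand not in ("+", "-"):
            raise ValueError(f"strand must be '+' or '-', got {self.strand!r}")


example = JunctionRead("chr1", 150, 350, 1, 12, "+", "GT", "AG")
```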
ce_detector.annotator
class for annotating junction reads
-
class
ce_detector.annotator.Annotator(junctionmap, database: Any, output=None)
annotate junction reads
- Parameters
junctionmap (instance) – instance of ce_detector.detector.JunctionMap
database (Any) – database of annotation files
output (TextIO) – filename of annotated junction reads. Defaults to None
-
annotate_junction(read, result, db)
annotate junction reads and write results to file
- Parameters
read (instance) – junction read, an instance of ce_detector.detector.Read
result (defaultdict[Any, Any]) – gene list used for annotation
db (instance of file) – database of annotation file
-
static
detect_property(start, end, junction_list)
detect the type of splice, the number of skipped donors and the number of skipped acceptors
The type of splice is one of D, A, DA, N or NDA
- Parameters
start (int) – start of junction read
end (int) – end of junction read
junction_list (numpy.array) – gene list of junction reads
- Returns
type of splice, number of skipped donors, number of skipped acceptors
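The classification described above can be sketched as follows. This is an illustration only, not the package's implementation: the function name `classify_junction`, the layout of the annotation array as `(donor, acceptor)` pairs, the plus-strand orientation and the way skipped sites are counted are all assumptions. The meaning of the five type labels follows the regtools convention.

```python
import numpy as np


def classify_junction(start, end, known):
    """Hypothetical stand-in: classify a plus-strand junction against annotation.

    known: (n, 2) integer array of annotated (donor, acceptor) positions.
    Returns (type, skipped_donors, skipped_acceptors).
    """
    known = np.asarray(known).reshape(-1, 2)
    donors, acceptors = known[:, 0], known[:, 1]

    donor_known = bool(np.any(donors == start))
    acceptor_known = bool(np.any(acceptors == end))
    pair_known = bool(np.any((donors == start) & (acceptors == end)))

    if pair_known:
        kind = "DA"    # annotated junction
    elif donor_known and acceptor_known:
        kind = "NDA"   # both ends annotated, but never paired together
    elif donor_known:
        kind = "D"     # only the donor is annotated
    elif acceptor_known:
        kind = "A"     # only the acceptor is annotated
    else:
        kind = "N"     # neither end is annotated

    # Assumption: a site is "skipped" if it lies strictly inside the junction.
    inside_d = donors[(donors > start) & (donors < end)]
    inside_a = acceptors[(acceptors > start) & (acceptors < end)]
    return kind, int(np.unique(inside_d).size), int(np.unique(inside_a).size)


annotation = np.array([[100, 200], [300, 400]])
novel_pairing = classify_junction(100, 400, annotation)
```

Here `novel_pairing` joins the donor of the first annotated junction to the acceptor of the second, so it skips one annotated acceptor and one annotated donor.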
ce_detector.scanner
class for scanning cryptic exons based on the results of the junction detector and junction annotator
-
class
ce_detector.scanner.Scanner(cutoff, output)
class for scanning cryptic exons based on annotated junction reads
- Parameters
cutoff (int) – cutoff used to filter junction reads with relatively low score or depth
output (str) – filename of the result
-
run(junctionmap, logger, verbose=False) → Iterable
run program to scan cryptic exons
- Parameters
verbose –
logger –
junctionmap (instance) – instance of ce_detector.detector.JunctionMap
- Returns
temporary result used to store cryptic exons
- Return type
Iterable
-
ce_detector.scanner.assign_value(df_ce, ces, ns, ce_id, ns_id) → None
assign the value of the children column for every cryptic exon that contains junction reads of type N, given the start and end of the cryptic exons and junction reads
Note: the types of junction reads are N, D, A, DA and NDA. For details: https://regtools.readthedocs.io/en/latest/commands/junctions-annotate/
- Parameters
df_ce (pandas.DataFrame) – pandas.DataFrame of cryptic exons
ces (pandas.DataFrame) – pandas.DataFrame of cryptic exons whose gene ids are shared by both cryptic exons and junction reads. It contains only gene id, start and end
ns (pandas.DataFrame) – pandas.DataFrame of junction reads of type N that have the same gene id as the cryptic exons
ce_id (numpy.array) – index of gene id of cryptic exons that have junction reads as children
ns_id (numpy.array) – index of gene id of junction reads of type N, which have junction reads as children
-
ce_detector.scanner.check(axis) → numpy.array
check whether a cryptic exon has children
That is, check whether the cryptic exon is split by other junction reads in terms of start and end positions
- Parameters
axis (numpy.array) – an array describing a junction read, including its start and end
- Returns
whether junction read can split cryptic exon
- Return type
bool
-
ce_detector.scanner.find_ce(groups) → Iterable
parse the results obtained from annotation in order to detect cryptic exons
- Parameters
groups (GroupBy object returned from DataFrame.groupby) – annotated junction reads grouped by strand and type
- Returns
pandas.DataFrame of two strands
- Return type
Iterable
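The `groups` argument is a pandas GroupBy object. A minimal sketch of how such an object can be built from a table of annotated junction reads is shown below. The column names `strand` and `type` and the example rows are assumptions for illustration; the columns actually produced by the annotator are not documented here.

```python
import pandas as pd

# Hypothetical annotated junction reads.
junctions = pd.DataFrame({
    "chrom": ["chr1", "chr1", "chr1", "chr1"],
    "start": [100, 300, 150, 500],
    "end": [200, 400, 250, 600],
    "strand": ["+", "+", "-", "-"],
    "type": ["D", "A", "D", "A"],
})

# Group by strand and junction type; each key is a (strand, type) tuple.
groups = junctions.groupby(["strand", "type"])
group_sizes = {key: len(frame) for key, frame in groups}
```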
-
ce_detector.scanner.split_ce(df_ce, df_n) → Iterable
iterator that checks whether detected cryptic exons are split by other junction reads
- Parameters
df_ce (pandas.DataFrame) – pandas.DataFrame of cryptic exons returned from find_ce()
df_n (pandas.DataFrame) – pandas.DataFrame of junction reads with N type
- Returns
df_ce with a new column, children
- Return type
iterator
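The idea of a cryptic exon being split by other junction reads can be sketched with a pandas merge. This is not the package's implementation: the function name `count_children`, the column names `gene_id`, `start` and `end`, the strict containment test, and the choice to store a count in `children` are all assumptions for this example.

```python
import pandas as pd


def count_children(df_ce, df_n):
    """Hypothetical helper: count N-type junctions lying inside each cryptic exon."""
    ce = df_ce.reset_index(drop=True).copy()
    ce["ce_row"] = ce.index
    # Pair every cryptic exon with the N-type junctions of the same gene.
    pairs = ce.merge(df_n, on="gene_id", suffixes=("_ce", "_n"))
    inside = pairs[(pairs["start_n"] > pairs["start_ce"])
                   & (pairs["end_n"] < pairs["end_ce"])]
    counts = inside.groupby("ce_row").size()
    ce["children"] = ce["ce_row"].map(counts).fillna(0).astype(int)
    return ce.drop(columns="ce_row")


cryptic_exons = pd.DataFrame({
    "gene_id": ["g1", "g1", "g2"],
    "start": [100, 600, 100],
    "end": [500, 700, 500],
})
n_junctions = pd.DataFrame({
    "gene_id": ["g1", "g1", "g2"],
    "start": [150, 200, 50],
    "end": [300, 400, 300],
})
with_children = count_children(cryptic_exons, n_junctions)
```

In this example only the first cryptic exon of `g1` contains N-type junctions; the `g2` junction starts before its exon and is therefore not counted.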