Extracts sequences from a fasta file for each of the intervals defined in a BED file

get_fasta_from_bed(bedFile, fasta_file, write_output = FALSE)

Arguments

bedFile

Either a path or a connection to the BED file. The file should be in standard UCSC BED format with six columns: 1. chromosome-id, 2. start, 3. end, 4. name, 5. score, 6. strand. **The file must not contain a header row of column names.

fasta_file

Either a path or a connection to the reference multi-FASTA file from which the sequences for the given input features are retrieved. In each sequence header, only the string before the first space and/or the first colon (:) is used for further processing. **This is an important consideration when headers have long names.

write_output

Logical; whether to write the output sequences to a multi-FASTA file. Default: FALSE
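
The header rule described for fasta_file can be illustrated with a minimal base R sketch. This is not the package's internal code; the header strings and the regular expression are assumptions chosen only to show which part of a header would be kept when everything from the first space or colon onward is dropped.

```r
# Hypothetical FASTA header lines (made up for illustration)
headers <- c(">chrI 230218 bp", ">chrII:1-813184 reference", ">chrIII")

# Drop the leading ">" and then everything from the first space or colon onward
ids <- sub("[ :].*$", "", sub("^>", "", headers))

print(ids)
```

The chromosome ids in the first column of the BED file would need to match these shortened ids rather than the full header text.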

Value

The sequences in the given BED regions.

Examples

if (FALSE) {
  bed_file_in <- system.file("exdata", "Sc_ref_genes.bed", package = "fastaR")
  ref_fasta <- system.file("exdata", "Sc_ref_genome.fasta", package = "fastaR")
  fastaR::get_fasta_from_bed(bedFile = bed_file_in, fasta_file = ref_fasta, write_output = FALSE)
}
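
As a further sketch, a BED file in the six-column layout described under bedFile could be prepared in base R as shown here. The gene names, coordinates, and the temporary file path are hypothetical, and the example assumes tab-separated columns; the point is only that the file is written without column names.

```r
# Hypothetical intervals in six-column BED layout (all values made up)
bed <- data.frame(
  chrom  = c("chrI", "chrI"),
  start  = c(334L, 537L),
  end    = c(649L, 792L),
  name   = c("geneA", "geneB"),
  score  = c(0L, 0L),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# Write tab-separated, without quotes, row names, or a header row
bed_path <- tempfile(fileext = ".bed")
write.table(bed, bed_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the file back to confirm the layout: six columns, no header
bed_back <- read.delim(
  bed_path, header = FALSE,
  colClasses = c("character", "integer", "integer",
                 "character", "integer", "character")
)
print(bed_back)

# The path could then be passed on, for example (not run here):
# fastaR::get_fasta_from_bed(bedFile = bed_path, fasta_file = ref_fasta)
```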