Fasta: Difference between revisions
CSV import |
CSV import |
||
{{DISPLAYTITLE:Fasta}}
== Overview ==
[[File:Chrysolina_fastuosa_(copula).ogv|thumb|right|Chrysolina fastuosa in copula]]
'''Fasta''' is a text-based format for representing nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede the sequences.
== History ==
The FASTA format was first introduced in the 1980s as part of the FASTA software package, which was developed for sequence alignment. The format has since become a standard in bioinformatics for sequence data exchange.
== Format Description ==
A FASTA file begins with a single-line description, followed by lines of sequence data. The description line starts with a greater-than (''>'') symbol, followed by a sequence identifier and optional description. The sequence data follows, with each line typically not exceeding 80 characters.
=== Example ===
<pre>
>sequence_1 Homo sapiens
ATGCGTACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGC
</pre>
== Applications ==
FASTA format is widely used in [[bioinformatics]] for storing and sharing [[DNA]], [[RNA]], and [[protein]] sequences. It is compatible with many bioinformatics tools and databases, such as [[BLAST]], [[GenBank]], and [[UniProt]].
== Advantages ==
The simplicity and flexibility of the FASTA format make it easy to parse and manipulate. It is human-readable and can be easily edited with any text editor.
== Limitations ==
FASTA format does not support rich metadata or annotations beyond the simple description line. For more complex data, formats like [[GenBank format]] or [[GFF]] may be more appropriate.
== Related Pages ==
* [[FASTA software]]
* [[Sequence alignment]]
* [[Bioinformatics]]
* [[GenBank]]
* [[UniProt]]
[[Category:Bioinformatics]]
[[Category:File formats]]
Revision as of 11:20, 15 February 2025
Overview
Fasta is a text-based format for representing nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede the sequences.
History
The FASTA format was first introduced in the 1980s as part of the FASTA software package, which was developed for sequence alignment. The format has since become a standard in bioinformatics for sequence data exchange.
Format Description
A FASTA file begins with a single-line description, followed by lines of sequence data. The description line starts with a greater-than (>) symbol, followed by a sequence identifier and optional description. The sequence data follows, with each line typically not exceeding 80 characters.
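The layout described above can be sketched as a small writer. This is a minimal illustration, not a reference implementation: the function name `format_fasta_record` and its parameters are hypothetical, and the 80-character default simply follows the typical line length mentioned in the text.

```python
def format_fasta_record(identifier, sequence, description="", width=80):
    """Return one FASTA record as a string (hypothetical helper)."""
    # Description line: '>' followed by the identifier and an optional description.
    header = ">" + identifier
    if description:
        header += " " + description
    # Split the sequence into lines of at most `width` characters.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + lines) + "\n"


# A 200-character sequence is wrapped into lines of 80, 80 and 40 characters.
record = format_fasta_record("sequence_1", "ACGT" * 50, "Homo sapiens")
```

The width is a parameter because the 80-character limit is a convention, not a requirement; many tools accept longer lines.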
Example
>sequence_1 Homo sapiens
ATGCGTACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGC
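Reading such a record back is equally simple. The following sketch assumes plain-text input with one or more records; `parse_fasta` and `_split_header` are hypothetical names chosen for illustration, and the code treats everything up to the first space on the description line as the identifier.

```python
def _split_header(header):
    """Split a description line (without '>') into identifier and description."""
    identifier, _, description = header.partition(" ")
    return identifier, description.strip()


def parse_fasta(text):
    """Yield (identifier, description, sequence) for each record in text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                yield _split_header(header) + ("".join(chunks),)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if header is not None:
        yield _split_header(header) + ("".join(chunks),)


example = (
    ">sequence_1 Homo sapiens\n"
    "ATGCGTACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGC\n"
)
records = list(parse_fasta(example))
```

For real workloads an established library is usually a better choice than a hand-written parser; this sketch only shows how little structure the format has.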
Applications
FASTA format is widely used in bioinformatics for storing and sharing DNA, RNA, and protein sequences. It is compatible with many bioinformatics tools and databases, such as BLAST, GenBank, and UniProt.
Advantages
The simplicity and flexibility of the FASTA format make it easy to parse and manipulate. It is human-readable and can be easily edited with any text editor.
Limitations
FASTA format does not support rich metadata or annotations beyond the simple description line. For more complex data, formats like GenBank format or GFF may be more appropriate.