------------------------------------------------------
EMBL NUCLEOTIDE SEQUENCE DATABASE SUBMISSION FORM

HOW TO USE THIS FORM - PLEASE READ FIRST

1) WEBIN: THE WORLD WIDE WEB SUBMISSION TOOL
============================================

If you have access to the World Wide Web then DO NOT use this form.
Use the WebIn form on the World Wide Web at

       ##############################################
       # http://www.ebi.ac.uk/submission/webin.html #
       ##############################################

If you do not have access to the World Wide Web then please use this
form and email it to DATASUBS@EBI.AC.UK.

It is only necessary to submit to one database. Public data are
exchanged between EMBL, GenBank and DDBJ on a daily basis.

2) MULTIPLE SUBMISSIONS
=======================

If you have more than one but fewer than 25 sequences to submit, copy
this form and send all the submissions together in one email with a
note saying how many sequences you are sending.

3) BULK SUBMISSIONS
===================

If you have more than 25 related sequences to submit DO NOT send them
all using this form. Instead email DATASUBS@EBI.AC.UK and include the
following information:

   a) how many sequences you are going to submit
   b) a short explanation of how the sequences are related
   c) what type of differences there are between the entries
      (e.g. isolate)
   d) one completed email submission form as an example

You will be contacted by a curator who will create a template for you
which you should then use to submit all of the sequences.

4) UPDATES
==========

DO NOT use this form for submitting updates or corrections. If you are
sending an update please complete the update form available on the web
at:
   http://www.ebi.ac.uk/ebi_docs/update.html
or get a copy of the update form via anonymous FTP:
   ftp://ftp.ebi.ac.uk/pub/databases/embl/release/update.doc

If you need help with updates contact UPDATE@EBI.AC.UK

5) PROTEIN SEQUENCES
====================

DO NOT use this form to submit protein sequences.
For submissions to the SWISS-PROT protein sequence databank access the
World Wide Web at
   http://www.ebi.ac.uk/ebi_docs/swissprot_db/swisshome.html
or email DATALIB@EBI.AC.UK

6) ACCESSION NUMBERS AND CONFIDENTIALITY
========================================

Your data can be made public immediately, or they can be kept
confidential until a release date which you provide. Confidential data
are ALWAYS made available to the public after publication.

If your data contain all the information we require we will assign
unique accession numbers within two working days. We will email you to
tell you the new accession numbers. You should submit your sequence
data BEFORE you have galley proofs.

We suggest that the following text be used to cite the accession
number(s) in publication(s):

"The nucleotide sequence data reported in this paper will appear in the
DDBJ/EMBL/GenBank Nucleotide Sequence Database under the accession
number(s) ________"

7) FORM FILLING INSTRUCTIONS
============================

<============== DO NOT EXCEED THIS LINE WIDTH IN YOUR REPLY ==============>

To display this form properly choose a fixed width font (e.g. Courier)
in your editor.

If you are saving files in a word processing program then please save
the file as TEXT ONLY WITH LINE BREAKS. (To do this in Microsoft Word
you will need to choose File, Save as, Save file type as, and select
Text only with line breaks.) Please do not send files that are saved in
Word or WordPerfect format.

Processing of the submission may be delayed if your email is text
wrapped, encoded or binhexed.

########################################################################
# Fill in the form as follows:                                         #
# a) if there is a colon : then enter text (e.g. Last name : Smith)    #
# b) if there is an empty box [ ] and if the answer is yes then fill   #
#    the box with an X (e.g. Genomic DNA [X])                          #
# c) if the option is not relevant then do not enter any text and/or   #
#    do not write an X in the box.                                     #
# d) DO NOT delete lines from this form.                               #
########################################################################

8) ENTERING FEATURES AND LOCATIONS
==================================

Enter the feature key from the list given in Appendix I at the end of
this document. Enter the locations, gene name, product name, and EC
number, where appropriate. Use < and > in the locations to show whether
the feature is partial at the 5' end and/or the 3' end. Mark the
corresponding box [ ] with an X if the feature is on the complementary
strand or if you have experimental evidence for the feature.

If you do not provide any features or adequate locations and names for
the features you will be contacted for more information before an
accession number is assigned to the sequence.

For CDS features you must provide a gene name AND a product name, even
if the product name is putative. If a CDS is partial at the 5' end then
write the codon start number. This is the position (1, 2 or 3) within
the feature of the first base of the first complete codon of the
translation. For example the following CDS is partial and the codon
start is 2 because the first complete codon, aca (translated as T),
starts with the base a, which is the second base in the feature.

DNA          tacatcgatg...
Translation    T  S  M...

FEATURE EXAMPLE NO.1

Feature key           :CDS
From                  :201
To                    :500
Gene name             :abcD
Product name          :ABC repressor protein
Codon start 1,2 or 3  :
EC number             :
Complementary strand  [ ]
Experimental evidence [X]

FEATURE EXAMPLE NO.2

Feature key           :rRNA
From                  :<1
To                    :>1500
Gene name             :16S rRNA
Product name          :16S ribosomal RNA
Codon start 1,2 or 3  :
EC number             :
Complementary strand  [ ]
Experimental evidence [ ]

If you have further questions after reading this form please contact
DATASUBS@EBI.AC.UK

I. CONFIDENTIAL STATUS

Enter an X if you want these data to be confidential [ ]

If confidential write the release date here :
(Date format DD-MMM-YYYY e.g. 30-JUN-1998)

II. CONTACT INFORMATION

Last name        :$(LAST_NAME)
First name       :$(FIRST_NAME)
Middle initials  :
Department       :$(DEPT)
Institution      :$(INSTITUTION)
Address          :$(ADDRESS)
                 :
                 :
Country          :$(COUNTRY)
Telephone        :$(PHONE)
Fax              :$(TELEFAX)
Email            :$(MAIL)

III. CITATION INFORMATION

Author 1         :$(author_1)
Author 2         :$(author_2)
Author 3         :$(author_3)
Author 4         :$(author_4)
Author 5         :$(author_5)
Author 6         :$(author_6)
Author 7         :$(author_7)
Author 8         :$(author_8)
Author 9         :$(author_9)
Author 10        :$(author_10)
Author 11        :$(author_11)
Author 12        :$(author_12)
(e.g. Smith A.B.)
(Copy line for extra authors)

Title            :$(title)
Journal          :$(journal)
Volume           :$(volume)
First page       :$(page_1)
Last page        :$(page_2)
Year             :$(year_pub)
Institute (if thesis):

Publication status
Mark one of the following
In preparation      [ ]    Accepted     [x]
Published           [ ]    Thesis/Book  [ ]
No plans to publish [ ]

IV. SEQUENCE INFORMATION

Sequence length (bp) :$(SEQ_LEN)

Molecule type
Mark one of the following
Genomic DNA   [ ]    cDNA to mRNA         [ ]
rRNA          [x]    tRNA                 [ ]
Genomic RNA   [ ]    cDNA to genomic RNA  [ ]

Mark if either of these apply
Circular      [ ]    Checked for vector contamination [ ]

V. SOURCE INFORMATION

Organism             :$(full_name)
Sub species          :
Strain               :$(strain)
Cultivar             :
Variety              :
Isolate/individual   :
Developmental stage  :
Tissue type          :
Cell type            :
Cell line            :
Clone                :$(clone)
Clone (if >1)        :
Clone library        :
Chromosome           :
Map position         :
Haplotype            :
Natural host         :
Laboratory host      :

Macronuclear         [ ]

Mark one if immunoglobulin or T cell receptor
Germline             [ ]    Rearranged     [ ]

Mark one if viral
Proviral             [ ]    Virion         [ ]

Mark one if from an organelle
Chloroplast          [ ]    Mitochondrion  [ ]
Chromoplast          [ ]    Kinetoplast    [ ]
Cyanelle             [ ]

Plasmid (not clone)  [ ]

Further source information (e.g. taxonomy, specimen voucher etc)
Note                 :$(tax)

VI. FEATURES OF THE SEQUENCE

YOU MUST DESCRIBE AT LEAST ONE FEATURE OF THE SEQUENCE
OR THERE WILL BE A DELAY IN THE PROCESSING OF YOUR SUBMISSION

Complete the block below for every feature you need to describe.
If you have more than one feature copy the block as many times as you
require. For help see 8) ENTERING FEATURES AND LOCATIONS above.

FEATURE NO.1

Feature key           :$(seq_type)
From                  :$(start)
To                    :$(end)
Gene name             :$(gene)
Product name          :$(gene_prod)
Codon start 1,2 or 3  :
EC number             :
Complementary strand  [ ]
Experimental evidence [ ]

VII. SEQUENCE INFORMATION

Enter the sequence data below (IUPAC nucleotide base codes,
Nucl. Acids Res. 13: 3021-3030, 1985)

BEGINNING OF SEQUENCE:
$(SEQUENCE)
END OF SEQUENCE

Include the translation for each CDS feature below.

BEGINNING OF TRANSLATION:

END OF TRANSLATION

---------------------------------------------------------------------------
These data will be shared among the following databases:
DDBJ Database (DNA Data Bank of Japan; Mishima, Japan);
EMBL Nucleotide Sequence Database (EBI, Cambridge, UK);
GenBank (NCBI, Bethesda, USA);
SWISS-PROT Protein Sequence Database (Geneva, Switzerland and
Heidelberg, FRG);
International Protein Information Database in Japan (JIPID; Noda,
Japan);
Martinsried Institute For Protein Sequence Data (MIPS; Martinsried,
FRG);
National Biomedical Research Foundation Protein Identification
Resource (NBRF-PIR; Washington, D.C., USA).

EMBL Data Submissions            E-mail    datasubs@ebi.ac.uk
European Bioinformatics Inst.    Telephone +44 (0)1223 494499
Hinxton Hall, Hinxton            Telefax   +44 (0)1223 494472
Cambridge CB10 1SD, UK
---------------------------------------------------------------------------

APPENDIX I FEATURE KEYS
=======================

A full description of features is found in the DDBJ/EMBL/GenBank
Feature Table Definition Document at
   ftp://ftp.ebi.ac.uk/pub/databases/embl/release/ftable.doc
and on the EBI website at
   http://www.ebi.ac.uk/ebi_docs/embl_db/ft/feature_table.html

An abbreviated list of feature keys is given below

C_region         constant region of immunoglobulin light and heavy
                 chain, and T-cell receptor alpha, beta and gamma
                 chains
CAAT_signal      eukaryotic promoter element;
                 consensus=GG(C or T)CAATCT
CDS              protein coding sequence (includes stop codon)
conflict         the "same" sequence reported by different
                 laboratories differs at this site or region
D_segment        diversity segment of immunoglobulin heavy chain and
                 T-cell receptor beta-chain
enhancer         cis-acting enhancer of eukaryotic promoter function
exon             region that codes for part of spliced mRNA
GC_signal        eukaryotic promoter element; consensus=GGGCGG
intron           transcribed region excised by mRNA splicing
J_segment        joining segment of immunoglobulin light and heavy
                 chains, T-cell receptor alpha, beta and gamma-chains
LTR              long terminal repeat
mat_peptide      mature peptide coding region (does not include stop
                 codon) or signal peptide
misc_feature     region of biological interest which cannot be
                 described by any other known feature
mRNA             messenger RNA
mutation         a related strain has an abrupt, inheritable change
                 in the sequence
polyA_signal     polyadenylation signal recognition region
polyA_site       polyadenylation site to which adenine residues are
                 added
primer_bind      non-covalent primer binding site
promoter         promoter region involved in transcription initiation
protein_bind     non-covalent protein binding site on DNA or RNA
RBS              ribosome binding site
rep_origin       origin of replication
repeat_region    region of genome containing repeating units
repeat_unit      single repeat element
rRNA             ribosomal RNA
S_region         switch region of immunoglobulin heavy chains
satellite        many tandem repeats of a short basic repeating unit
sig_peptide      signal peptide coding region
stem_loop        hair-pin loop structure in DNA or RNA
STS              sequence tagged site
TATA_signal      eukaryotic promoter element;
                 consensus=TATA(A or T)A(A or T)
terminator       transcription termination signal
transit_peptide  transit peptide coding region
tRNA             transfer RNA
V_region         variable region of immunoglobulin light and heavy
                 chains, and T-cell receptor alpha, beta, and gamma
                 chains
V_segment        variable segment of immunoglobulin light and heavy
                 chains, and T-cell receptor alpha, beta, and gamma
                 chains
variation        a related strain contains stable mutations from the
                 same gene (e.g., RFLPs, polymorphisms)
3'UTR            region at the 3' end of a mature transcript,
                 following the stop codon
5'UTR            region at the 5' end of a mature transcript,
                 preceding the initiation codon
-10_signal       prokaryotic promoter element, consensus=TAtAaT
-35_signal       prokaryotic promoter element, consensus=TTGACa or
                 TGTTGACA

(Last change: 08-DEC-1998)
(Wendy Baker, EMBL nucleotide sequence database curator)

Agnes Leyen
EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Cambridge CB10 1SD
UK

DATASUBMISSIONS: +44 1223 494499   datasubs@ebi.ac.uk
UPDATES:         +44 1223 494499   updates@ebi.ac.uk
PERSONAL:        +44 1223 494411   leyen@ebi.ac.uk