#Please insert up references in the next lines (line starts with keyword UP) UP arb.hlp UP e4.hlp UP pfold.hlp UP glossary.hlp #Please insert subtopic references (line starts with keyword SUB) # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} #************* Title of helpfile !! and start of real helpfile ******** TITLE Protein Match Settings OCCURRENCE ARB_EDIT4/Properties/Protein Match Settings DESCRIPTION In the 'Protein Match Settings' window the protein structure match computation can be configured. The settings are described in the following section. SECTION Configuration Show protein structure match: Toggle display of protein match symbols Selected Protein Structure SAI: The protein secondary structure SAI used as reference for match computation. The default is 'PFOLD'. Filter SAI names for: Via a filter the SAIs shown in the option menu can be narrowed down to a selection of SAIs whose names contain the specified string. This is useful for a great number of SAIs to quickly find the one that should be used. Default is 'pfold'. Match Method: The used method for protein structure match computation. Default is 'Secondary Structure <-> Sequence' which is most probable the method of choice. Details on the different methods can be found below in section 'Description of Match Methods'. Match Symbols (only relevant for the match method 'Secondary Structure <-> Sequence'): Ten symbols that represent the match quality ranging from 0 - 100% in steps of 10%. Take care to enter exactly ten symbols. Note that spaces (' ') are symbols, too. Pair definitions (only relevant for the match methods 'Secondary Structure <-> Secondary Structure' and 'Secondary Structure <-> Sequence (Full Prediction)'). Each line contains two textfields: - The left textfield contains one or more amino acid pairs. Each pair contains two characters (amino acids, gaps-characters, ...). Pairs are separated by spaces (' '). - The right textfield contains the match symbol used for each of the specified pairs. SECTION Description of Match Methods Match Method 'Secondary Structure <-> Secondary Structure' Use this method if you want to compare protein secondary structures only. The characters representing species secondary structures are compared one by one with the ones of the selected secondary structure SAI using the pair definitions and the defined match symbols. If undefined pairs are encountered the 'Unknown_match' symbol is displayed. Match Method 'Secondary Structure <-> Sequence' Species amino acid sequences are compared with the selected secondary structure SAI by taking cohesive parts of the structure - gaps in the alignment are skipped - and computing values from 0 - 100% (in steps of 10%) for the match quality which are mapped to the defined match symbols. The whole part is marked with that symbol. Note that bends ('S') are assumed to fit everywhere (=> best match symbol), and if a structure is encountered but no corresponding amino acid the worst match symbol is displayed. Match Method 'Secondary Structure <-> Sequence (Full Prediction)' Species amino acid sequences are compared with the selected secondary structure SAI using a full prediction of secondary structures from their sequences (via the Chou-Fasman algorithm) and comparing the characters one by one with the reference structure SAI. Note that not the structure summaries are used for comparison, but individually predicted alpha-helices ('H'), beta-sheets ('E') and beta-turns ('T'). The pair definitions are searched in ascending order, i.e. good matches first, then the worse ones. If a match is found the corresponding match symbol is displayed. Note that if a structure is encountered but no corresponding amino acid the worst match symbol is displayed. NOTES - The menu entry 'Properties -> Protein Match Settings' is only shown for protein alignments ('Alignment Information -> : pro', see LINK{ad_align.hlp}). - The match computation for sequences and secondary structures is based on the Chou-Fasman algorithm or adaptions to it. See LINK{pfold.hlp} for explanation and reference. SECTION TODO The settings window should only show the fields that are relevant for the current match method. EXAMPLES None WARNINGS !!! The match computation can only give a rough overview if a given amino acid sequence matches a certain secondary structure. Do not fully rely on it but use it as hints for aligning your amino acid sequences. !!! !!! The match method 'Secondary Structure <-> Sequence (Full Prediction)' is experimental. It is probably not very reliable and requires a lot of computation. Thus, it should not be used for a large number of species loaded in the editor. !!! BUGS The editor might be unstable and can crash if sequences are not formatted.