#Please insert up references in the next lines (line starts with keyword UP)
UP      arb.hlp
UP      glossary.hlp
UP      pt_server.hlp

#Please insert subtopic references  (line starts with keyword SUB)

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile strunk ********
TITLE		Probe Collection Matching

OCCURRENCE	ARB_NT/Probes/Match Probes with Specificity


DESCRIPTION

		Searches for potential probe target sites within the sequence
		entries of the corresponding 'PT_SERVER' (not the current)
		database from a list of named probes in a probe collection. Matching
		is performed with the maximum allowable number of mismatches for
		each given probe.

		Probe collections are created or loaded with the PROBE COLLECTION
		window. Probes must be named with unique names otherwise the
		match results cannot be reconcilled correctly. Similarly, species
		within the tree must also have unique names for the match processing
		to work correctly.

		To add a probe to the collection enter the 'Target String' and the
		'Probe Name' and press the ADD button. To remove a probe from the
		collection select the probe from the 'Probes' list and press the
		REMOVE button. Pressing the 'FORGET' button will remove all probes
		from the collection and start a fresh.

		To save the probe collection to file press the 'SAVE' button and
		to load a previously created collection press the 'LOAD' button.
		Probe collections are stored in a simple XML format so they can
		be easily created with an external text editor. The file format
		is detailed below.

		The 'Match Weighting' matrix specifies how mismatch penalities will
		be alloted to sequence mismatches. The 'Positional Weighting'
		parameters adjust the mismatch penalities according to position
		through the following equations:

		  S = -ln(10) / 'Width'

		  P = (((2.0 * 'position') - 'length') / 'length') - 'Bias'

		  Weight = exp(S * P * P)

		where 'position' is the sequence position and 'length' is the probe
		length. weighting function gives a bell curve shape whose spread
		is controlled by the 'Width' parameter, centre is controlled
		by the 'Bias' parameter and whose maximum is one.

		For the default values of 1 and 0 for 'Width' and 'Bias' respectively
		the weighting function has a value of 1 for a position that is half
		the probe length and 0.1 at the zeroth position and the probe length
		position.

		The 'Match Weighting' and 'Positional Weighting' parameters are
		saved as part of the probe collection XML file.

		The MATCH PROBES WITH SPECIFICITY window is used to perform probe
		collection matching. The 'Probes' list shows the probes in the
		probe collection to be matched. If the list is empty you can click
		on the EDIT button to open the PROBE COLLECTION window and create
		or open a probe collection. The CLEAR button clears any previous
		match results but leaves the probe collection in tact.

		You need to select a 'PT_SERVER' from the menu displayed after
		pressing the 'PT_SERVER' button before you can carry out a probe
		collection match. Press the MATCH button to carry out the match
		operation. When the match is complete the number of matches found
		will be displayed and the complete list of match results can be
		viewed by pressing the RESULTS button. Be warned that with large
		probe collections this can be a very large text file.

		Match results are displayed in the DENDROGRAM view using a series
		of vertical bars (one bar per probe) on the left hand side indicating
		regions in the tree where matches occur. Left mouse clicking on the
		bar will open a status message telling you which probe the bar
		corresponds to.

SECTION 	Match display control

		What constitutes a match is controlled by the MATCH DISPLAY CONTROL
		parameters in the MATCH DISPLAY CONTROL window. The controls
		allow you to test, in real time, the match performance of probe
		collections without having to re-run the time consuming match operation.

		The 'Mismatch threshold' slider controls the threshold level that
		dictates whether a partial match will be regarded as a match or a
		mismatch. The scale of the 'Mismatch threshold' spans the range
		from zero to the maximum match weight for the found match results.

		The 'Clade marked threshold' slider controls the threshold level
		(between 0 and 100%) that governs whether a clade is marked as
		matched. For example, if the slider was set to 70% it would indicate
		that at least 70% of species within a clade must match to the degree
		dictated by the 'Mismatch threshold' before the clade is marked as
		matched.

		In a similar manner, the 'Clade partially marked threshold' slider
		controls the threshold level (between 0 and 100%) that governs whether
		a clade is marked as partially matched. Partial clade matches are
		indicated with a stippled bar whereas for a full match the bar is
		solid.

                More options are available via 'Marker display settings'
                (see LINK{nt_tree_marker_settings.hlp}).

SECTION         Display interaction

                  Click (and drag) on a marker shown in tree display, to
                  display its name and to select the corresponding probe
                  in the probe selection list.

SECTION PROBE COLLECTION XML

        <?xml version="1.0" encoding="UTF-8" standalone="no"?>
        <!DOCTYPE probe_collection>
        <probe_collection name="">
            <probe_list>
                <probe seq="AGGUCACACCCGUUCCCA" name="probe1"/>
                <probe seq="AGGUCACACCCGUUCCCG" name="probe2"/>
                <probe seq="AGGUCACACCCGUUCCCT" name="probe3"/>
                    .
                    .
                    .
            </probe_list>
            <match_weighting width="1" bias="0">
                <penalty_matrix values="0 1 1 2 1 0 1 1 1 1 0 1 2 1 1 0"/>
            </match_weighting>
        </probe_collection>


        The penalty matrix values follow row major ordering.

NOTES

		The 'PT_SERVER' database ('*.arb' and '*.arb.pt') stored in
		'$ARBHOME/lib/pts' is used for probe target searching not the
		current database.

		The 'PT_SERVER' database has to be updated ('ARB_NT/Probes/Probe
		Admin')	if species entries should be considered for probe target
		searching which have been added or modified (sequence symbols)
		later than the date of the most recent 'PT_SERVER' database
		update.

		Probe target searching does not depend on correctly aligned
		sequences and is not affected by any modifications of database
		entries except changes of sequence residues.


EXAMPLES

        None


WARNINGS

		Take care to ensure that all probes in the probe collection and all
		species in the current database are uniquely named. Not doing so
		will result in results not being displayed correctly.

BUGS

        No bugs known