Class NeedlemanWunsch

  • Direct Known Subclasses:
    SmithWaterman

    public class NeedlemanWunsch
    extends SequenceAlignment
    Needleman and Wunsch defined the problem of global sequence alignments, from the first till the last symbol of a sequence. This class is able to perform such global sequence comparisons efficiently by dynamic programming. If inserts and deletes are equally expensive and as expensive as the extension of a gap, the alignment method of this class does not use affine gap penalties. Otherwise it does. Those costs need four times as much memory, which has significant effects on the run time, if the computer needs to swap.
    Since:
    1.5
    Author:
    Andreas Dräger, Gero Greiner, Mark Schreiber
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected java.lang.String alignment
      The result of a successful alignment as a simple String.
      protected int[][] CostMatrix
      A matrix with the size length(sequence1) times length(sequence2)
      protected Alignment pairalign
      The result of a successful alignment
      protected SubstitutionMatrix subMatrix
      A matrix with the size length(alphabet) times length(alphabet)
    • Constructor Summary

      Constructors 
      Constructor Description
      NeedlemanWunsch​(short match, short replace, short insert, short delete, short gapExtend, SubstitutionMatrix subMat)
      Constructs a new Object with the given parameters based on the Needleman-Wunsch algorithm The alphabet of sequences to be aligned will be taken from the given substitution matrix.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.util.List<Alignment> alignAll​(SequenceIterator source, SequenceDB subjectDB)  
      Alignment getAlignment​(SymbolList query, SymbolList target)
      This method is good if one wants to reuse the alignment calculated by this class in another BioJava class.
      java.lang.String getAlignmentString()  
      short getDelete()
      Returns the current expenses of a single delete operation.
      int getEditDistance()
      This gives the edit distance according to the given parameters of this certain object.
      short getGapExt()
      Returns the current expenses of any extension of a gap operation.
      short getInsert()
      Returns the current expenses of a single insert operation.
      short getMatch()
      Returns the current expenses of a single match operation.
      short getReplace()
      Returns the current expenses of a single replace operation.
      protected static int min​(int x, int y, int z)
      This just computes the minimum of three integer values.
      int pairwiseAlignment​(SymbolList query, SymbolList subject)
      Global pairwise sequence alignment of two BioJava-Sequence objects according to the Needleman-Wunsch-algorithm.
      static void printAlignment​(java.lang.String align)
      prints the alignment String on the screen (standard output).
      static java.lang.String printCostMatrix​(int[][] CostMatrix, char[] queryChar, char[] targetChar)
      Prints a String representation of the CostMatrix for the given Alignment on the screen.
      void setDelete​(short del)
      Sets the penalty for a delete operation to the specified value.
      void setGapExt​(short ge)
      Sets the penalty for an extension of any gap (insert or delete) to the specified value.
      void setInsert​(short ins)
      Sets the penalty for an insert operation to the specified value.
      void setMatch​(short ma)
      Sets the penalty for a match operation to the specified value.
      void setReplace​(short rep)
      Sets the penalty for a replace operation to the specified value.
      void setSubstitutionMatrix​(SubstitutionMatrix matrix)
      Sets the substitution matrix to be used to the specified one.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • CostMatrix

        protected int[][] CostMatrix
        A matrix with the size length(sequence1) times length(sequence2)
      • subMatrix

        protected SubstitutionMatrix subMatrix
        A matrix with the size length(alphabet) times length(alphabet)
      • pairalign

        protected Alignment pairalign
        The result of a successful alignment
      • alignment

        protected java.lang.String alignment
        The result of a successful alignment as a simple String.
    • Constructor Detail

      • NeedlemanWunsch

        public NeedlemanWunsch​(short match,
                               short replace,
                               short insert,
                               short delete,
                               short gapExtend,
                               SubstitutionMatrix subMat)
        Constructs a new Object with the given parameters based on the Needleman-Wunsch algorithm The alphabet of sequences to be aligned will be taken from the given substitution matrix.
        Parameters:
        match - This gives the costs for a match operation. It is only used, if there is no entry for a certain match of two symbols in the substitution matrix (default value).
        replace - This is like the match parameter just the default, if there is no entry in the substitution matrix object.
        insert - The costs of a single insert operation.
        delete - The expenses of a single delete operation.
        gapExtend - The expenses of an extension of a existing gap (that is a previous insert or delete. If the costs for insert and delete are equal and also equal to gapExtend, no affine gap penalties will be used, which saves a significant amount of memory.
        subMat - The substitution matrix object which gives the costs for matches and replaces.
    • Method Detail

      • setSubstitutionMatrix

        public void setSubstitutionMatrix​(SubstitutionMatrix matrix)
        Sets the substitution matrix to be used to the specified one. Afterwards it is only possible to align sequences of the alphabet of this substitution matrix.
        Parameters:
        matrix - an instance of a substitution matrix.
      • setInsert

        public void setInsert​(short ins)
        Sets the penalty for an insert operation to the specified value.
        Parameters:
        ins - costs for a single insert operation
      • setDelete

        public void setDelete​(short del)
        Sets the penalty for a delete operation to the specified value.
        Parameters:
        del - costs for a single deletion operation
      • setGapExt

        public void setGapExt​(short ge)
        Sets the penalty for an extension of any gap (insert or delete) to the specified value.
        Parameters:
        ge - costs for any gap extension
      • setMatch

        public void setMatch​(short ma)
        Sets the penalty for a match operation to the specified value.
        Parameters:
        ma - costs for a single match operation
      • setReplace

        public void setReplace​(short rep)
        Sets the penalty for a replace operation to the specified value.
        Parameters:
        rep - costs for a single replace operation
      • getInsert

        public short getInsert()
        Returns the current expenses of a single insert operation.
        Returns:
        insert
      • getDelete

        public short getDelete()
        Returns the current expenses of a single delete operation.
        Returns:
        delete
      • getGapExt

        public short getGapExt()
        Returns the current expenses of any extension of a gap operation.
        Returns:
        gapExt
      • getMatch

        public short getMatch()
        Returns the current expenses of a single match operation.
        Returns:
        match
      • getReplace

        public short getReplace()
        Returns the current expenses of a single replace operation.
        Returns:
        replace
      • printCostMatrix

        public static java.lang.String printCostMatrix​(int[][] CostMatrix,
                                                       char[] queryChar,
                                                       char[] targetChar)
        Prints a String representation of the CostMatrix for the given Alignment on the screen. This can be used to get a better understanding of the algorithm. There is no other purpose. This method also works for all extensions of this class with all kinds of matrices.
        Parameters:
        CostMatrix - The matrix that contains all expenses for swapping symbols.
        queryChar - a character representation of the query sequence (mySequence.seqString().toCharArray()).
        targetChar - a character representation of the target sequence.
        Returns:
        a String representation of the matrix.
      • printAlignment

        public static void printAlignment​(java.lang.String align)
        prints the alignment String on the screen (standard output).
        Parameters:
        align - The parameter is typically given by the getAlignmentString() method.
      • getAlignment

        public Alignment getAlignment​(SymbolList query,
                                      SymbolList target)
                               throws java.lang.Exception
        This method is good if one wants to reuse the alignment calculated by this class in another BioJava class. It just performs pairwiseAlignment and returns an Alignment instance containing the two aligned sequences.
        Specified by:
        getAlignment in class SequenceAlignment
        Returns:
        Alignment object containing the two gapped sequences constructed from query and target.
        Throws:
        java.lang.Exception
      • getEditDistance

        public int getEditDistance()
        This gives the edit distance according to the given parameters of this certain object. It returns just the last element of the internal cost matrix (left side down). So if you extend this class, you can just do the following: int myDistanceValue = foo; this.CostMatrix = new int[1][1]; this.CostMatrix[0][0] = myDistanceValue;
        Returns:
        returns the edit_distance computed with the given parameters.
      • min

        protected static int min​(int x,
                                 int y,
                                 int z)
        This just computes the minimum of three integer values.
        Parameters:
        x -
        y -
        z -
        Returns:
        Gives the minimum of three integers
      • alignAll

        public java.util.List<Alignment> alignAll​(SequenceIterator source,
                                                  SequenceDB subjectDB)
                                           throws java.util.NoSuchElementException,
                                                  BioException
        Specified by:
        alignAll in class SequenceAlignment
        Parameters:
        source - a SequenceIterator containing a set of sequences to be aligned with
        subjectDB - the SequenceDB containing another set of sequences.
        Returns:
        a list containing the results of all single alignments performed by this method.
        Throws:
        java.util.NoSuchElementException
        BioException