'STRAP:multiple sequence alignments '

charite.christo.strap
Class StrapProtein

java.lang.Object
  extended by charite.christo.protein.Protein
      extended by charite.christo.protein.ProteinNT
          extended by charite.christo.strap.StrapProtein
All Implemented Interfaces:
DialogStringMatch.Interface, ChRunnable, HasClientProperty, HasDndFiles, HasImage, HasName, HasProtein, HasRendererComponent, ActionListener, Comparator, EventListener
Direct Known Subclasses:
BiojavaSequence2StrapProtein

public class StrapProtein
extends ProteinNT
implements ChRunnable, HasRendererComponent, HasImage, HasDndFiles

The class StrapProtein describes the protein objects in STRAP and is important for users who write plugins for STRAP. It extends JAVADOC:ProteinNT so that proteins can be read from nucleotide files. In addition, it supports gaps, which are the basis for sequence alignments.

get-methods in StrapProtein

Methods starting with the prefix getResidue return information associated with amino acids. They come in two versions. Without a parameter, they return an array with at least as many elements as there are amino acids, or null if the information is not available. With a residue index as parameter, they return the information for that particular residue; if the information is not available, -1, NaN or null is returned.

What you should never do

The Java language allows elements within arrays to be changed, but you should not change elements in arrays obtained from a get-method. To change information, always use the appropriate setter, such as JAVADOC:Protein#setResidueType(byte[]). Never use the .length field of an array to obtain the number of amino acids, because the array may be longer than the sequence. Use the method JAVADOC:Protein#countResidues() instead.
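The conventions above can be illustrated with a minimal sketch. The class ResidueGetterDemo is a hypothetical stand-in written for this example, not the STRAP class; its buffer size and the choice of -1 as the "not available" value are assumptions. It shows only why loops should be bounded by countResidues() and why changes should go through a setter.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a protein; it only mimics the getter conventions. */
class ResidueGetterDemo {
    // The backing array is deliberately larger than the sequence.
    private byte[] residueType = new byte[16];
    private int residueCount;

    /** Setter: copies the data so that callers never write into the internal array. */
    void setResidueType(byte[] types) {
        if (types.length > residueType.length) residueType = new byte[types.length];
        System.arraycopy(types, 0, residueType, 0, types.length);
        residueCount = types.length;
    }

    int countResidues() { return residueCount; }

    /** Array version: may hold more elements than there are residues. */
    byte[] getResidueType() { return residueType; }

    /** Indexed version: -1 signals "not available" in this sketch. */
    int getResidueType(int i) {
        return i >= 0 && i < residueCount ? residueType[i] : -1;
    }

    public static void main(String[] args) {
        ResidueGetterDemo p = new ResidueGetterDemo();
        p.setResidueType("ACDE".getBytes(StandardCharsets.US_ASCII));
        byte[] types = p.getResidueType();
        StringBuilder sb = new StringBuilder();
        // Loop up to countResidues(), never up to types.length.
        for (int i = 0; i < p.countResidues(); i++) sb.append((char) types[i]);
        System.out.println(sb + ": " + p.countResidues() + " residues, array length " + types.length);
    }
}
```

Looping to types.length instead would read unused trailing elements of the array.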

Here are some frequently used get-methods:

Caching information:

Since STRAP allows working with hundreds of proteins simultaneously, computation must be kept to a minimum, and computing the same result several times must be avoided. Each method has an input and a return value. When a method is called a second time with the same input, it can return the previous result. A computationally intensive method should therefore keep its last result in memory. If the input is an array, however, it would be expensive to test whether all elements of the array are still the same. For that reason, methods with the suffix modificationCount return a version number of the information. The modification count is a non-negative value that increases (by 1 or perhaps by more) whenever elements of the corresponding array change. The initial value is unspecified. A method simply needs to determine whether the modification count has changed since it was last called. For example, a method which depends on the amino acid sequence obtained with JAVADOC:Protein#getResidueType() can retrieve the version with JAVADOC:Protein#getResidueType_modificationCount(). Depending on whether this version number has changed, it can repeat the calculation or just return the last result.
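The caching scheme described above might look roughly like the following sketch. The nested class Sequence is a hypothetical stand-in that offers a getter and a matching modification count; CysteineCounter is an invented example of an expensive computation. Neither is part of STRAP.

```java
import java.nio.charset.StandardCharsets;

class ModCountCacheDemo {
    /** Hypothetical stand-in: a value plus a version number that grows on change. */
    static class Sequence {
        private byte[] residueType = new byte[0];
        private int modCount;
        void setResidueType(byte[] t) { residueType = t.clone(); modCount++; }
        byte[] getResidueType() { return residueType; }
        int getResidueType_modificationCount() { return modCount; }
    }

    /** Remembers the last result and recomputes only when the version changed. */
    static class CysteineCounter {
        private int lastModCount = -1; // counts are non-negative, so -1 means "never computed"
        private int cached;
        int computations;              // only for demonstration

        int count(Sequence s) {
            int mc = s.getResidueType_modificationCount();
            if (mc != lastModCount) {
                int n = 0;
                for (byte b : s.getResidueType()) if (b == 'C') n++;
                cached = n;
                lastModCount = mc;
                computations++;
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        Sequence s = new Sequence();
        CysteineCounter counter = new CysteineCounter();
        s.setResidueType("ACCD".getBytes(StandardCharsets.US_ASCII));
        counter.count(s);
        counter.count(s); // answered from the cache
        System.out.println("computations: " + counter.computations);
    }
}
```

The sketch keeps one cached result per counter object, so a counter shared between several proteins would also need to remember which protein the cached value belongs to.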


Field Summary
static String METHOD_PARSE
           
static StrapProtein[] NONE
           
 
Fields inherited from class charite.christo.protein.ProteinNT
AMINO_ACIDS, FORWARD, FORWARD_COMPLEMENT, NUCLEOTIDES, ORIENTATIONS, REVERSE, REVERSE_COMPLEMENT
 
Fields inherited from class charite.christo.protein.Protein
CDS_GENE, CDS_NOTE, CDS_PRODUCT, CDS_PROTEIN, CDS_XREFS, clientObjects_, FILE_MASKS, mANNO, mDNA, mGAPS, mICON, MODI_COUNTS, MODI_COUNTS_ALIGNMENT, mTIPS, mTRANS, PARSING_INFO, vProteinViewers, vResidueSelection, WEAK_REF
 
Fields inherited from interface charite.christo.interfaces.ChRunnable
APPEND, DOWNLOAD_FINISHED, INTERPRET_LINE, LOG, SET_ICON_IMAGE
 
Constructor Summary
StrapProtein()
           
StrapProtein(ProteinAlignment a)
           
 
Method Summary
 void addResidueSelection(ResidueSelection s)
          Add a selection of residues
 int column2index(int col)
          Get index of residue at the horizontal alignment position col.
 int column2nextIndex(int h)
          Get index of residue at the horizontal alignment position h or the next index if h points to a gap.
 int column2thisOrPreviousIndex(int col)
          Get index of residue at the horizontal alignment position col.
 int[] columns2indices()
          Get the indices of residues at the horizontal alignment positions.
 int[] columns2nextIndices()
           
 ByteArray cursorPosAsString(int ias, boolean showColumn)
           
 void dispose()
           
 ProteinAlignment getAlignment()
          Get the alignment object.
 ResidueSelection[] getAllResidueSelections()
           
 File[] getDndFiles()
           
 byte[] getGappedSequence()
          The amino acid sequence with gaps (0x20).
 String getGappedSequenceAsString()
          The amino acid sequence.
 String getGappedSequenceAsStringUC()
          The amino acid sequence.
 byte[] getGappedSequenceExactLength()
          The amino acid sequence.
 byte[] getGappedSequences()
           
static byte[][] getGappedSequences(StrapProtein[] pp, int colFrom, int colTo, char gap)
           
 byte[] getGappedSequenceTS()
          To be run outside the EDT (event dispatch thread).
 ImageIcon getIcon()
           
 Image getIconImage()
           
 String getIconUrl()
           
 Image getImage(Component observer)
          Get the icon of the protein or null
 String getImageId()
           
 int getMaxColumn()
          Get the horizontal position of the last residue.
static int getMaxColumn(StrapProtein[] pp)
           
static long getParserOptions()
           
static StrapProtein[] getProteinsWithNames(ByteArray ba, StrapProtein[] pp)
           
static StrapProtein getProteinWithName(String name, Protein[] pp0)
          Returns the first protein whose name equals name.
static StrapProtein getProteinWithNameAndFile(String n, File file, Object[] pp)
           
 ReferenceSequence getReferenceSequence(String otherId, ChRunnable log, boolean background, Runnable runWhenFinished)
           
 JComponent getRendererComponent()
           
 StringBuffer getRendererText(int options)
           
 ResidueAnnotation[] getResidueAnnotations()
           
 ResidueAnnotation getResidueAnnotationWithName(CharSequence name)
           
 int[] getResidueColumn()
          Maps residue indices to horizontal text positions of the alignment.
 int getResidueColumn(int i)
          Maps residue indices to horizontal text positions of the alignment.
 int[] getResidueGap()
           
 int getResidueGap(int i)
           
 ResidueSelection[] getResidueSelectionsAt(int resIdxFrom, int resIdxTo)
           
 ResidueSelection[] getResidueSelectionsAt(int iA_from, int iA_to, int where)
           
 boolean[] getResiduesInAlignmentcolumnrangeAsBoolean(int fromColumn, int toColumn)
           
 byte[] getSelectedAminoacids()
          Returns an array telling what residues are selected by at least one residue selection.
 byte[] getSelectedNucleotides()
           
 JComponent getVerticalRendererComponent()
           
static void inferCoordinates_BG(StrapProtein[] pp, String[] pdbID, long options, Runnable whenFinished)
           
static void inferCoordinates_EDT(StrapProtein[] pp, String[] pdbID, long options)
           
 void inferGapsFromGappedSequence(byte[] seq)
          The original residue types are kept but the gaps are inferred from the gapped sequence seq.
 boolean isTransient()
           
static StrapProtein[] loadProteinsInList(ByteArray ba, long options)
           
static StrapProtein[] loadProteinsInList(File f, long options)
           
static StrapProtein newInstance(File proteinFile)
           
static StrapProtein newInstance(File proteinFile, long options)
          Creates a protein from a protein file. The file type may be PDB, SWISSPROT, EMBL or FASTA. The method is not thread-safe because it uses a static byte buffer for file data.
 boolean parse(ByteArray txt, ProteinParser[] parsers, long mode)
          Interprets the text, extracts the protein data and sets fields such as residueType in the protein.
 void removeResidueSelection(Class c)
          Remove all residue selections that are instances of c.
 void removeResidueSelection(ResidueSelection s)
          Remove selection of residues
 Object run(String id, Object arg)
           
 void save(File dir, StringBuffer error)
           
 void setAlignment(ProteinAlignment a)
           
 void setGappedSequence(byte[] seq, int len)
          Set the gapped sequence.
 void setGappedSequence(CharSequence seq)
          See JAVADOC:StrapProtein#setGappedSequence(byte[], int)
 void setIconImage(String url)
           
 void setInfoAssociatedPdb(String info)
           
static void setParserOptions(long options)
           
 void setResidueGap(int[] rg)
           
 void setResidueGap(int iA, int gap)
           
 void setTransient(boolean b)
          Transient proteins are not saved.
 void texshade(String txt)
           
 
Methods inherited from class charite.christo.protein.ProteinNT
aminoAcidIndex2nucleotideIndex, applyCDS_expression, areNucleotidesCurrentStrandTranslated, cdsExpressionApplied, countCodingNucleotides, countNucleotides, getNucleotide, getNucleotideCurrentStrand, getNucleotides, getNucleotidesAsString, getNucleotidesAsStringUC, getNucleotidesCurrentStrand, getNucleotidesCurrentStrandAsString, getNucleotidesCurrentStrandExactLength, getNucleotidesReverseComplementAsStringUC, getResidueTriplet, getResidueTypeFullLength, isComplement, isNucleotideCurrentStrandTranslated, isReverse, nucleotideIdx2translatedNucleotideIdx, nucleotideIndex2aminoAcidIndex, nucleotideIndices2translatedNucleotideIndices, nucleotideSelection2aminoAcidSelection, parseFrameShift, setNucleotideCurrentStrandTranslated, setNucleotides, setNucleotides, setNucleotidesCurrentStrandTranslated, setTranslatedStrand, translatedNucleotideIndex2nucleotideIndex, translatedNucleotideIndices2nucleotideIndices
 
Methods inherited from class charite.christo.protein.Protein
actionPerformed, addCDS, addDatabaseRef, addHeteroCompounds, addHeteroCompoundsUniq, addProteinViewer, addSequenceRef, addThreeLetterCode, atomNumber2residueIdx, commonPdbId, comparator_aminoacidSequenceAlphabetically, compare, containsHeteroCompound, containsOnlyACTGNX, countAtoms, countResidues, countResiduesInChain, equalsResidueType, fileAllChains, geneProductProteinNoteForCDS, getAccessionID, getAssertEDT, getAtomBFactor, getAtomCoordinates_modificationCount, getAtomCoordinates, getAtomCoordinates, getAtomName32, getAtomName32, getAtomNumber, getAtomOccupancy, getAtomType, getBioMatrices, getCalphaIndexToResidueIndex, getCDS, getChainFirstResidueIdx, getChainLastResidueIdx, getChainsAsString, getCharacters, getClientProperty, getClientPropertyMapAsString, getCompound, getDatabaseRefs, getDnaAndRnaAndHeteros, getDnaAndRnaStructures, getEC, getFile, getFileAsString, getFileHasSidechainAtoms, getFileLastModified, getFileMayBeWithSideChains, getFileName, getFileNameHC, getFilePath, getFileWithAllChains, getFirstResidueIndex, getGappedSequence_modificationCount, getHaystacksForStringSearch, getHeader, getHeteroCompounds, getInferred3dCountMatches, getInferredPdbID, getInfo, getLastResidueIndex, getMouseOverSelection, getName, getName0, getNameHC, getNucleotides_modificationCount, getOnlyChains, getOrganism, getOrganismScientific, getParsingTime, getPdbID, getPdbRef, getPdbTextSecStru, getProtein, getProteinParserClass, getProteinViewer, getProteinViewers, getResidueAnglePsi, getResidueAtomNumber, getResidueAtomNumber, getResidueCalpha, getResidueCalpha, getResidueCalpha, getResidueChain, getResidueChain, getResidueFirstAtomIdx, getResidueFirstAtomIdx, getResidueInsertionCode, getResidueInsertionCode, getResidueLastAtomIdx, getResidueLastAtomIdx, getResidueName32, getResidueName32, getResidueNumber, getResidueNumber, getResidueSecStrType, getResidueSecStrType, getResidueSelections_modificationCount, getResidueSelections, getResidueSolventAccessibility, 
getResidueSubsetAsBooleanArray, getResidueType_modificationCount, getResidueType, getResidueType, getResidueType, getResidueTypeAsString, getResidueTypeAsStringUC, getResidueTypeExactLength, getResidueTypeHashCode, getResolutionAnstroms, getRotationAndTranslation, getSequenceRefs, getTitle, getUniprotID, getURL, hasCalpha, hashTMalign, inferCoordinates, isInMsfFile, isLoadedFromStructureFile, isWaterInFile, loadSideChainAtoms, mouseOver, paintChain, pdbNumberToIdx, putClientProperty, removeAllSequenceRefs, removeHeteroCompound, removeProteinViewer, removeSequenceRef, residueNumberAndChain2residueIdx, residueSubsetAsBooleanArray, residueSubsetAsBooleanArray, selectedPositionsToText, setAccessionID, setAssertEDT, setAtomBFactor, setAtomCoordOriginal, setAtomNumber, setAtomOccupancy, setAtomType, setAtomType32bit, setBioMatrices, setCharSequence, setCompound, setEC, setFile, setFileHasSidechainAtoms, setFileLastModified, setFileWithAllChains, setFileWithSideChains, setFileWithSideChainsObtainableFromPDB, setFirstResidueIndex, setHeader, setIsInMsfFile, setIsLoadedFromStructureFile, setName, setOnlyChains, setOrganism, setOrganismScientific, setParsingTime, setPdbID, setPdbTextSecStru, setProteinParserClass, setResidueAnglePhi, setResidueAnglePsi, setResidueAtomNumber, setResidueCalphaOriginal, setResidueChain, setResidueFirstAndLastAtomIdx, setResidueInsertionCode, setResidueNumber, setResidueSecStrType, setResidueSolventAccessibility, setResidueSubset, setResidueType, setResidueType, setResidueType32bit, setResidueTypeToUpperOrLower, setResolutionAnstroms, setRotationAndTranslation, setTitle, setUniprotID, setURL, strapFile, toOneLetterCode, toOneLetterCode, toString, toThreeLetterCode
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.Comparator
equals
 

Field Detail

METHOD_PARSE

public static final String METHOD_PARSE
See Also:
Constant Field Values

NONE

public static final StrapProtein[] NONE
Constructor Detail

StrapProtein

public StrapProtein(ProteinAlignment a)

StrapProtein

public StrapProtein()
Method Detail

newInstance

public static StrapProtein newInstance(File proteinFile,
                                       long options)
Creates a protein from a protein file. The file type may be PDB, SWISSPROT, EMBL or FASTA. The method is not thread-safe because it uses a static byte buffer for file data.


newInstance

public static StrapProtein newInstance(File proteinFile)

getResiduesInAlignmentcolumnrangeAsBoolean

public boolean[] getResiduesInAlignmentcolumnrangeAsBoolean(int fromColumn,
                                                            int toColumn)

dispose

public void dispose()

getAlignment

public final ProteinAlignment getAlignment()
Get the alignment object.

Overrides:
getAlignment in class Protein

setAlignment

public final void setAlignment(ProteinAlignment a)

setResidueGap

public void setResidueGap(int[] rg)

setResidueGap

public void setResidueGap(int iA,
                          int gap)

getResidueGap

public int[] getResidueGap()

getResidueGap

public final int getResidueGap(int i)

getResidueColumn

public int[] getResidueColumn()
Maps residue indices to horizontal text positions of the alignment.

Returns:
an integer-array of all horizontal positions for each residue

getResidueColumn

public final int getResidueColumn(int i)
Maps residue indices to horizontal text positions of the alignment.


getMaxColumn

public int getMaxColumn()
Get the horizontal position of the last residue.


columns2nextIndices

public int[] columns2nextIndices()

columns2indices

public final int[] columns2indices()
Get the indices of residues at the horizontal alignment positions.

Returns:
The residue index or -1 for an alignment position (column).

column2index

public int column2index(int col)
Get index of residue at the horizontal alignment position col.

Overrides:
column2index in class Protein
Parameters:
col - the alignment position (column)
Returns:
The residue index or -1 for an alignment position (column).

column2thisOrPreviousIndex

public final int column2thisOrPreviousIndex(int col)
Get index of residue at the horizontal alignment position col. If col points to a gap, return the previous index.

Parameters:
col - the alignment position (column)
Returns:
The residue index or -1 for an alignment position (column).

column2nextIndex

public final int column2nextIndex(int h)
Get index of residue at the horizontal alignment position h or the next index if h points to a gap.

Parameters:
h - the alignment position (column)
Returns:
the index of the residue at alignment position
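The three column lookups can be compared in a small, self-contained sketch. ColumnIndexDemo is a hypothetical helper that derives the column-to-index table from a gapped string; returning -1 when no previous or next residue exists is an assumption of this sketch and is not stated in the documentation above.

```java
import java.util.Arrays;

class ColumnIndexDemo {
    /** Residue index for every column, -1 where the column holds a gap. */
    static int[] columns2indices(String gapped) {
        int[] c2i = new int[gapped.length()];
        int idx = 0;
        for (int col = 0; col < gapped.length(); col++) {
            c2i[col] = Character.isLetter(gapped.charAt(col)) ? idx++ : -1;
        }
        return c2i;
    }

    /** Residue index at col, or -1 for a gap or a column outside the table. */
    static int column2index(int[] c2i, int col) {
        return col >= 0 && col < c2i.length ? c2i[col] : -1;
    }

    /** Residue index at col, or the nearest one to the left if col is a gap. */
    static int column2thisOrPreviousIndex(int[] c2i, int col) {
        for (int c = Math.min(col, c2i.length - 1); c >= 0; c--) {
            if (c2i[c] >= 0) return c2i[c];
        }
        return -1; // assumption: no residue at or before col
    }

    /** Residue index at col, or the nearest one to the right if col is a gap. */
    static int column2nextIndex(int[] c2i, int col) {
        for (int c = Math.max(col, 0); c < c2i.length; c++) {
            if (c2i[c] >= 0) return c2i[c];
        }
        return -1; // assumption: no residue at or after col
    }

    public static void main(String[] args) {
        int[] c2i = columns2indices("A--CD-E");
        System.out.println(Arrays.toString(c2i));
        System.out.println(column2index(c2i, 1));
        System.out.println(column2thisOrPreviousIndex(c2i, 1));
        System.out.println(column2nextIndex(c2i, 1));
    }
}
```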

getGappedSequenceTS

public byte[] getGappedSequenceTS()
To be run outside the EDT (event dispatch thread).


getGappedSequence

public byte[] getGappedSequence()
The amino acid sequence with gaps (0x20). The array may be longer than getMaxColumn(). In this case the end is marked with 0x00.
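Because the returned array may be longer than the alignment row, a caller has to find the logical end itself. The following sketch uses a hypothetical helper class, GappedBytesDemo, and relies only on the two conventions stated above: 0x20 for a gap and 0x00 as the end marker.

```java
class GappedBytesDemo {
    /** Number of bytes in use: up to the first 0x00, or the whole array. */
    static int usedLength(byte[] seq) {
        for (int i = 0; i < seq.length; i++) {
            if (seq[i] == 0) return i;
        }
        return seq.length;
    }

    /** Converts the used part to a string, writing gaps (0x20) as '-'. */
    static String toDashedString(byte[] seq) {
        int n = usedLength(seq);
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(seq[i] == 0x20 ? '-' : (char) seq[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An over-long buffer: five used bytes followed by 0x00 padding.
        byte[] buffer = {'A', ' ', ' ', 'C', 'D', 0, 0, 0};
        System.out.println(toDashedString(buffer));
    }
}
```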


getGappedSequences

public static byte[][] getGappedSequences(StrapProtein[] pp,
                                          int colFrom,
                                          int colTo,
                                          char gap)

getGappedSequences

public byte[] getGappedSequences()

getGappedSequenceExactLength

public byte[] getGappedSequenceExactLength()
The amino acid sequence. Gaps are character '-'.


getGappedSequenceAsString

public String getGappedSequenceAsString()
The amino acid sequence. Gaps are character '-'.


getGappedSequenceAsStringUC

public String getGappedSequenceAsStringUC()
The amino acid sequence. Gaps are character '-'.


inferGapsFromGappedSequence

public void inferGapsFromGappedSequence(byte[] seq)
The original residue types are kept but the gaps are inferred from the gapped sequence seq. A letter denotes an amino acid, any other character a gap.
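A minimal sketch of the idea, assuming that only the positions of the letters in seq matter and not their identity. InferGapsDemo is a hypothetical helper; how STRAP treats a sequence with fewer letters than residues is not documented here, so the fallback in the sketch is an arbitrary choice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class InferGapsDemo {
    static boolean isLetter(byte b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z');
    }

    /** Column of each residue, taken from the letter positions in the gapped sequence. */
    static int[] inferResidueColumns(byte[] gapped, int residueCount) {
        int[] columns = new int[residueCount];
        int i = 0, col = 0;
        for (; col < gapped.length && i < residueCount; col++) {
            if (isLetter(gapped[col])) columns[i++] = col;
        }
        // Assumption of this sketch: leftover residues follow without gaps.
        while (i < residueCount) columns[i++] = col++;
        return columns;
    }

    /** Writes the original residue types at the inferred columns, '-' elsewhere. */
    static String render(byte[] residueType, int[] columns) {
        int width = columns.length == 0 ? 0 : columns[columns.length - 1] + 1;
        char[] out = new char[width];
        Arrays.fill(out, '-');
        for (int i = 0; i < columns.length; i++) out[columns[i]] = (char) residueType[i];
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] original = "mkvl".getBytes(StandardCharsets.US_ASCII);
        byte[] gapped = "A-B--CD".getBytes(StandardCharsets.US_ASCII);
        int[] columns = inferResidueColumns(gapped, original.length);
        // The letters of the original sequence survive; only the gap pattern is copied.
        System.out.println(render(original, columns));
    }
}
```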


setGappedSequence

public void setGappedSequence(byte[] seq,
                              int len)
Set the gapped sequence.


setGappedSequence

public void setGappedSequence(CharSequence seq)
See JAVADOC:StrapProtein#setGappedSequence(byte[], int)


getDndFiles

public File[] getDndFiles()
Specified by:
getDndFiles in interface HasDndFiles

save

public void save(File dir,
                 StringBuffer error)

isTransient

public boolean isTransient()

setTransient

public void setTransient(boolean b)
Transient proteins are not saved.


parse

public boolean parse(ByteArray txt,
                     ProteinParser[] parsers,
                     long mode)
Interprets the text, extracts the protein data and sets fields such as residueType in the protein.

Parameters:
parsers - An array of parsers to do the job. Try all until one succeeds.
txt - the protein text in fasta or pdb or swissprot format.
Returns:
true on success
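The "try all until one succeeds" behaviour can be sketched as a simple loop. The interface TextParser and the two toy parsers below are hypothetical and far simpler than real ProteinParser implementations; they exist only to show the control flow.

```java
class ParserChainDemo {
    /** Hypothetical parser: returns the residues, or null if the format is not recognised. */
    interface TextParser {
        String parse(String txt);
    }

    /** Toy FASTA parser: a header line starting with '>' followed by sequence lines. */
    static final TextParser FASTA = txt -> {
        if (!txt.startsWith(">")) return null;
        int nl = txt.indexOf('\n');
        if (nl < 0) return null;
        return txt.substring(nl + 1).replaceAll("\\s", "");
    };

    /** Toy raw parser: accepts text that consists of letters and whitespace only. */
    static final TextParser RAW = txt -> {
        String s = txt.replaceAll("\\s", "");
        return !s.isEmpty() && s.chars().allMatch(Character::isLetter) ? s : null;
    };

    /** Tries each parser in turn and returns the first non-null result. */
    static String parse(String txt, TextParser[] parsers) {
        for (TextParser p : parsers) {
            String residues = p.parse(txt);
            if (residues != null) return residues;
        }
        return null; // no parser recognised the text
    }

    public static void main(String[] args) {
        TextParser[] parsers = {RAW, FASTA};
        System.out.println(parse(">p1\nACDE\nFG\n", parsers));
        System.out.println(parse("12345", parsers));
    }
}
```

The order of the parsers matters in such a chain, since the first parser that accepts the text determines the result.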

getParserOptions

public static long getParserOptions()

setParserOptions

public static void setParserOptions(long options)

getImage

public Image getImage(Component observer)
Get the icon of the protein or null

Specified by:
getImage in interface HasImage

getIcon

public ImageIcon getIcon()

getIconImage

public Image getIconImage()
Overrides:
getIconImage in class Protein

setIconImage

public void setIconImage(String url)

getIconUrl

public String getIconUrl()

getImageId

public String getImageId()

setInfoAssociatedPdb

public void setInfoAssociatedPdb(String info)

getRendererComponent

public JComponent getRendererComponent()
Specified by:
getRendererComponent in interface HasRendererComponent

getVerticalRendererComponent

public JComponent getVerticalRendererComponent()

getRendererText

public final StringBuffer getRendererText(int options)

texshade

public void texshade(String txt)

getSelectedAminoacids

public byte[] getSelectedAminoacids()
Returns an array telling which residues are selected by at least one residue selection. An element of 0 means not selected; any other number means selected.
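A sketch of how such an array could be built and read, using a hypothetical helper class SelectedResiduesDemo and plain boolean arrays in place of ResidueSelection objects. Writing 1 for a selected residue is an assumption; the documentation only promises a non-zero value.

```java
import java.util.Arrays;

class SelectedResiduesDemo {
    /** Marks every residue that is selected by at least one of the selections. */
    static byte[] mergeSelections(boolean[][] selections, int residueCount) {
        byte[] selected = new byte[residueCount];
        for (boolean[] s : selections) {
            for (int i = 0; i < residueCount && i < s.length; i++) {
                if (s[i]) selected[i] = 1; // any non-zero value would do
            }
        }
        return selected;
    }

    /** Counts selected residues; compares against 0 rather than a specific value. */
    static int countSelected(byte[] selected, int residueCount) {
        int n = 0;
        for (int i = 0; i < residueCount && i < selected.length; i++) {
            if (selected[i] != 0) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        boolean[][] selections = {
            {true, false, false, false},
            {false, false, true}
        };
        byte[] selected = mergeSelections(selections, 4);
        System.out.println(Arrays.toString(selected) + " -> " + countSelected(selected, 4));
    }
}
```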


getSelectedNucleotides

public byte[] getSelectedNucleotides()

getAllResidueSelections

public ResidueSelection[] getAllResidueSelections()
Overrides:
getAllResidueSelections in class Protein

removeResidueSelection

public void removeResidueSelection(Class c)
Remove all residue selections that are instances of c.


addResidueSelection

public void addResidueSelection(ResidueSelection s)
Description copied from class: Protein
Add a selection of residues

Overrides:
addResidueSelection in class Protein

removeResidueSelection

public void removeResidueSelection(ResidueSelection s)
Description copied from class: Protein
Remove selection of residues

Overrides:
removeResidueSelection in class Protein

cursorPosAsString

public ByteArray cursorPosAsString(int ias,
                                   boolean showColumn)
Overrides:
cursorPosAsString in class Protein

getResidueSelectionsAt

public ResidueSelection[] getResidueSelectionsAt(int resIdxFrom,
                                                 int resIdxTo)

getResidueSelectionsAt

public ResidueSelection[] getResidueSelectionsAt(int iA_from,
                                                 int iA_to,
                                                 int where)
Overrides:
getResidueSelectionsAt in class Protein

getResidueAnnotationWithName

public ResidueAnnotation getResidueAnnotationWithName(CharSequence name)

getResidueAnnotations

public ResidueAnnotation[] getResidueAnnotations()
Returns:
All annotations of the protein

getReferenceSequence

public ReferenceSequence getReferenceSequence(String otherId,
                                              ChRunnable log,
                                              boolean background,
                                              Runnable runWhenFinished)

loadProteinsInList

public static StrapProtein[] loadProteinsInList(File f,
                                                long options)

loadProteinsInList

public static StrapProtein[] loadProteinsInList(ByteArray ba,
                                                long options)

getProteinsWithNames

public static final StrapProtein[] getProteinsWithNames(ByteArray ba,
                                                        StrapProtein[] pp)

getProteinWithName

public static StrapProtein getProteinWithName(String name,
                                              Protein[] pp0)
Returns the first protein whose name equals name.


getProteinWithNameAndFile

public static StrapProtein getProteinWithNameAndFile(String n,
                                                     File file,
                                                     Object[] pp)

getMaxColumn

public static int getMaxColumn(StrapProtein[] pp)

run

public Object run(String id,
                  Object arg)
Specified by:
run in interface ChRunnable
Overrides:
run in class ProteinNT

inferCoordinates_BG

public static void inferCoordinates_BG(StrapProtein[] pp,
                                       String[] pdbID,
                                       long options,
                                       Runnable whenFinished)

inferCoordinates_EDT

public static void inferCoordinates_EDT(StrapProtein[] pp,
                                        String[] pdbID,
                                        long options)

'The most important classes are StrapAlign, StrapProtein and StrapEvent.'