org.datamanager.clustering.matrix
Interface DocumentTermMatrix

All Known Implementing Classes:
DocumentTermMatrixImpl

public interface DocumentTermMatrix

This interface marks all matrices that maintain raw document-by-term occurrence counts.

Version:
$Revision: 1.10 $
Author:
Team Helium

Method Summary
 void addWords(Entity entity, WordFrequencyMapEntityValue newWords)
          Adds the (word, frequency) pairs in newWords to the document identified by entity.
 double getSimilarityBetween(Entity a, Entity b)
          Returns the similarity between the two documents, which should be between 0 and 1.
 ClusterableMatrix toClusterableMatrix()
          Returns a ClusterableMatrix of the documents in this matrix.
 

Method Detail

toClusterableMatrix

public ClusterableMatrix toClusterableMatrix()
Returns a ClusterableMatrix of the documents in this matrix.


addWords

public void addWords(Entity entity,
                     WordFrequencyMapEntityValue newWords)
Adds the (word, frequency) pairs in newWords to the document identified by entity.
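
A minimal sketch of how an implementation might store these pairs. It assumes a document is identified by a plain String id and that frequencies for a word added more than once accumulate; this page confirms neither assumption. SimpleTermCounts and getCount are hypothetical names used for illustration and stand in for Entity, WordFrequencyMapEntityValue and DocumentTermMatrixImpl, whose definitions are not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a document-by-term matrix; not the real DocumentTermMatrixImpl.
public class SimpleTermCounts {
    // Outer key: document id (assumed String). Inner map: word -> raw occurrence count.
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // Adds each (word, frequency) pair to the document's row.
    // Assumption: repeated words accumulate rather than overwrite.
    public void addWords(String docId, Map<String, Integer> newWords) {
        Map<String, Integer> row = counts.computeIfAbsent(docId, k -> new HashMap<>());
        for (Map.Entry<String, Integer> e : newWords.entrySet()) {
            row.merge(e.getKey(), e.getValue(), Integer::sum);
        }
    }

    // Hypothetical accessor: returns the stored count, or 0 for an unknown document or word.
    public int getCount(String docId, String word) {
        Map<String, Integer> row = counts.get(docId);
        return row == null ? 0 : row.getOrDefault(word, 0);
    }
}
```

If an implementation is instead meant to overwrite existing frequencies, the merge call would become a plain put.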


getSimilarityBetween

public double getSimilarityBetween(Entity a,
                                   Entity b)
Returns the similarity between the two documents, which should be between 0 and 1. A value of 1 implies that the two documents are identical.
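
This page does not say which similarity measure implementations use. As one illustration of a measure that satisfies the stated 0-to-1 contract for non-negative term frequencies, the sketch below computes cosine similarity over two word-to-count maps. CosineSimilaritySketch and its methods are hypothetical, and plain maps stand in for the Entity arguments.

```java
import java.util.Map;

// Hypothetical illustration of one possible measure; the real implementation may differ.
public class CosineSimilaritySketch {

    // Cosine similarity between two term-frequency vectors stored as word -> count maps.
    // Returns 0 when either document has no terms, to avoid dividing by zero.
    public static double similarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        // Clamp to 1.0 to absorb floating-point rounding.
        return Math.min(1.0, dot / (normA * normB));
    }

    // Euclidean length of a term-frequency vector.
    private static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int f : v.values()) {
            sum += (double) f * f;
        }
        return Math.sqrt(sum);
    }
}
```

Under this particular measure, a result of 1 means the two documents have the same relative term frequencies, which is slightly weaker than the documents being identical; documents with no words in common score 0.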


