public interface SRMetric
| Modifier and Type | Method and Description |
|---|---|
| double[][] | cosimilarity(int[] ids): Construct a symmetric cosimilarity matrix of Wikipedia ids in a given language. |
| double[][] | cosimilarity(int[] wpRowIds, int[] wpColIds): Construct a cosimilarity matrix of Wikipedia ids in a given language. |
| double[][] | cosimilarity(String[] phrases): Construct a symmetric cosimilarity matrix of phrases by mapping through local pages. |
| double[][] | cosimilarity(String[] rowPhrases, String[] colPhrases): Construct a cosimilarity matrix of phrases. |
| File | getDataDir(): Returns the directory containing all data for the metric. |
| Language | getLanguage() |
| Normalizer | getMostSimilarNormalizer() |
| String | getName() |
| Normalizer | getSimilarityNormalizer() |
| SRResultList | mostSimilar(int pageId, int maxResults): Find the most similar local pages to a local page within the same language. |
| SRResultList | mostSimilar(int pageId, int maxResults, gnu.trove.set.TIntSet validIds): Find the most similar local pages to a local page. |
| SRResultList | mostSimilar(String phrase, int maxResults): Find the most similar local pages to a phrase. |
| SRResultList | mostSimilar(String phrase, int maxResults, gnu.trove.set.TIntSet validIds): Find the most similar local pages to a phrase. |
| boolean | mostSimilarIsTrained() |
| void | read(): Reads the metric from the current data directory. |
| void | setDataDir(File dir): Sets the data directory associated with the model. |
| void | setMostSimilarNormalizer(Normalizer n): Sets the most similar normalizer. |
| void | setSimilarityNormalizer(Normalizer n): Sets the similarity normalizer. |
| SRResult | similarity(int pageId1, int pageId2, boolean explanations): Determine the similarity between two local pages. |
| SRResult | similarity(String phrase1, String phrase2, boolean explanations): Determine the similarity between two strings in a given language by mapping through local pages. |
| boolean | similarityIsTrained() |
| void | trainMostSimilar(Dataset dataset, int numResults, gnu.trove.set.TIntSet validIds): Train the mostSimilar() function. The KnownSims may already be associated with Wikipedia ids (check wpId1 and wpId2). |
| void | trainSimilarity(Dataset dataset): Train the similarity() function. |
| void | write(): Writes the metric to the current data directory. |
String getName()
Language getLanguage()
File getDataDir()
void setDataDir(File dir)
Parameters:
dir
SRResult similarity(int pageId1, int pageId2, boolean explanations) throws DaoException
Parameters:
pageId1 - Id of the first page.
pageId2 - Id of the second page.
explanations - Whether explanations should be created.
Throws:
DaoException
SRResult similarity(String phrase1, String phrase2, boolean explanations) throws DaoException
Parameters:
phrase1 - The first phrase.
phrase2 - The second phrase.
explanations - Whether explanations should be created.
Throws:
DaoException
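The role of the explanations flag can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: ToyResult and ToyPhraseMetric are stand-ins invented for this example, and the word-overlap (Jaccard) score is not the algorithm any real SRMetric implementation uses. It only shows the calling shape of similarity(String, String, boolean) and that explanations are built only when requested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a similarity result: a score plus optional explanations.
final class ToyResult {
    final double score;
    final List<String> explanations; // empty when explanations were not requested

    ToyResult(double score, List<String> explanations) {
        this.score = score;
        this.explanations = explanations;
    }
}

// Toy metric (an assumption for illustration): Jaccard overlap of lowercase word sets.
final class ToyPhraseMetric {
    static ToyResult similarity(String phrase1, String phrase2, boolean explanations) {
        Set<String> a = words(phrase1);
        Set<String> b = words(phrase2);
        Set<String> shared = new HashSet<String>(a);
        shared.retainAll(b);
        Set<String> union = new HashSet<String>(a);
        union.addAll(b);
        double score = union.isEmpty() ? 0.0 : (double) shared.size() / union.size();

        List<String> why = new ArrayList<String>();
        if (explanations) {
            // Build explanations only when the caller asked for them.
            for (String w : shared) {
                why.add("shared word: " + w);
            }
        }
        return new ToyResult(score, why);
    }

    // Splits a phrase into a set of lowercase whitespace-separated tokens.
    private static Set<String> words(String phrase) {
        return new HashSet<String>(Arrays.asList(phrase.toLowerCase().trim().split("\\s+")));
    }

    public static void main(String[] args) {
        ToyResult r = similarity("apple pie", "apple tart", true);
        System.out.println(r.score + " " + r.explanations);
    }
}
```

Passing false for explanations skips the extra bookkeeping, which is the usual reason such a flag exists when many pairs are scored in bulk.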
SRResultList mostSimilar(int pageId, int maxResults) throws DaoException
Parameters:
pageId - The id of the local page whose similarity we are examining.
maxResults - The maximum number of results to return.
Throws:
DaoException
SRResultList mostSimilar(int pageId, int maxResults, gnu.trove.set.TIntSet validIds) throws DaoException
Parameters:
pageId - The id of the local page whose similarity we are examining.
maxResults - The maximum number of results to return.
validIds - The local page ids to be considered. Null means all ids in the language.
Throws:
DaoException
SRResultList mostSimilar(String phrase, int maxResults) throws DaoException
Parameters:
phrase - The phrase whose similarity we are examining.
maxResults - The maximum number of results to return.
Throws:
DaoException
SRResultList mostSimilar(String phrase, int maxResults, gnu.trove.set.TIntSet validIds) throws DaoException
Parameters:
phrase - The phrase whose similarity we are examining.
maxResults - The maximum number of results to return.
validIds - The local page ids to be considered. Null means all ids in the language.
Throws:
DaoException
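The maxResults and validIds parameters of the mostSimilar overloads amount to "filter, rank, truncate". The sketch below shows that pattern under stated assumptions: ToyMostSimilar and its SCORES table are invented for illustration, and a java.util.Set<Integer> stands in for gnu.trove.set.TIntSet so the example has no external dependencies. The one behaviour taken from the documentation is that a null validIds means every id is eligible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyMostSimilar {
    // Hypothetical precomputed scores: SCORES[i] is the similarity of page i to the query.
    static final double[] SCORES = {0.10, 0.90, 0.40, 0.75, 0.20};

    // Returns up to maxResults candidate ids, best score first.
    // validIds == null means every id is eligible, mirroring the documented contract.
    static List<Integer> mostSimilar(int maxResults, Set<Integer> validIds) {
        List<Integer> candidates = new ArrayList<Integer>();
        for (int id = 0; id < SCORES.length; id++) {
            if (validIds == null || validIds.contains(id)) {
                candidates.add(id);
            }
        }
        Collections.sort(candidates, new Comparator<Integer>() {
            @Override
            public int compare(Integer x, Integer y) {
                return Double.compare(SCORES[y], SCORES[x]); // descending by score
            }
        });
        int n = Math.max(0, Math.min(maxResults, candidates.size()));
        return new ArrayList<Integer>(candidates.subList(0, n));
    }

    public static void main(String[] args) {
        System.out.println(mostSimilar(2, null)); // prints [1, 3]
        Set<Integer> valid = new HashSet<Integer>(Arrays.asList(0, 2, 4));
        System.out.println(mostSimilar(2, valid)); // prints [2, 4]
    }
}
```

Filtering before ranking keeps the result list full even when the highest-scoring pages fall outside validIds.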
void write() throws IOException
Throws:
IOException
void read() throws IOException
Throws:
IOException
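The data-directory methods suggest a simple lifecycle: set the directory, write the metric, and later read it back from the same location. The following is a minimal sketch of that round trip, not the library's storage format. ToyPersistentMetric, its single scale parameter, and the file name toy-metric.properties are all assumptions made for this example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Properties;

// Hypothetical model with one learned parameter, persisted under a data directory.
final class ToyPersistentMetric {
    private File dataDir;
    double scale = 1.0;

    File getDataDir() { return dataDir; }
    void setDataDir(File dir) { this.dataDir = dir; }

    // The file name is an assumption made for this sketch.
    private File modelFile() { return new File(dataDir, "toy-metric.properties"); }

    // Writes the model state to the current data directory.
    void write() throws IOException {
        if (!dataDir.isDirectory() && !dataDir.mkdirs()) {
            throw new IOException("cannot create " + dataDir);
        }
        Properties p = new Properties();
        p.setProperty("scale", Double.toString(scale));
        try (OutputStream out = new FileOutputStream(modelFile())) {
            p.store(out, "toy metric state");
        }
    }

    // Reads the model state from the current data directory.
    void read() throws IOException {
        Properties p = new Properties();
        try (InputStream in = new FileInputStream(modelFile())) {
            p.load(in);
        }
        scale = Double.parseDouble(p.getProperty("scale"));
    }

    // Writes one instance to a temporary directory and reads it back into a fresh one.
    static double roundTrip(double value) {
        try {
            File dir = Files.createTempDirectory("toy-metric").toFile();
            ToyPersistentMetric saved = new ToyPersistentMetric();
            saved.setDataDir(dir);
            saved.scale = value;
            saved.write();

            ToyPersistentMetric loaded = new ToyPersistentMetric();
            loaded.setDataDir(dir);
            loaded.read();
            return loaded.scale;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(0.75));
    }
}
```

Because write() and read() both resolve paths from the data directory, calling setDataDir before either one is what ties a saved model to the instance that loads it.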
void trainSimilarity(Dataset dataset) throws DaoException
Parameters:
dataset - A gold standard dataset.
Throws:
DaoException
void trainMostSimilar(Dataset dataset, int numResults, gnu.trove.set.TIntSet validIds)
Parameters:
dataset - A gold standard dataset.
numResults - The maximum number of similar articles computed per phrase.
validIds - The Wikipedia ids that should be considered in result sets. Null means all ids.
boolean similarityIsTrained()
boolean mostSimilarIsTrained()
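The train and isTrained methods imply a "fit once, then query" lifecycle. The documentation does not say what training computes, so the sketch below picks one plausible interpretation purely as an assumption: fitting a linear map from raw scores to gold-standard scores by least squares. ToyCalibratedMetric, its arrays of raw and gold scores, and the normalize method are hypothetical and stand in for the real Dataset and Normalizer types.

```java
// Hypothetical calibrator: fits gold = slope * raw + intercept by least squares.
final class ToyCalibratedMetric {
    private double slope = 1.0;
    private double intercept = 0.0;
    private boolean trained = false;

    // raw[i] is the untrained score for pair i; gold[i] is the gold-standard score.
    void trainSimilarity(double[] raw, double[] gold) {
        if (raw.length != gold.length || raw.length < 2) {
            throw new IllegalArgumentException("need at least two aligned pairs");
        }
        double meanRaw = 0.0;
        double meanGold = 0.0;
        for (int i = 0; i < raw.length; i++) {
            meanRaw += raw[i];
            meanGold += gold[i];
        }
        meanRaw /= raw.length;
        meanGold /= gold.length;

        double sxy = 0.0; // covariance numerator
        double sxx = 0.0; // variance numerator
        for (int i = 0; i < raw.length; i++) {
            sxy += (raw[i] - meanRaw) * (gold[i] - meanGold);
            sxx += (raw[i] - meanRaw) * (raw[i] - meanRaw);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("raw scores are all identical");
        }
        slope = sxy / sxx;
        intercept = meanGold - slope * meanRaw;
        trained = true;
    }

    boolean similarityIsTrained() {
        return trained;
    }

    // Maps a raw score onto the gold-standard scale; refuses to run before training.
    double normalize(double rawScore) {
        if (!trained) {
            throw new IllegalStateException("call trainSimilarity first");
        }
        return slope * rawScore + intercept;
    }

    public static void main(String[] args) {
        ToyCalibratedMetric m = new ToyCalibratedMetric();
        m.trainSimilarity(new double[] {0, 1, 2}, new double[] {1, 3, 5});
        System.out.println(m.similarityIsTrained() + " " + m.normalize(3));
    }
}
```

Callers can check similarityIsTrained() (or mostSimilarIsTrained() for the other code path) before querying, rather than relying on an exception to discover an untrained metric.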
double[][] cosimilarity(int[] wpRowIds, int[] wpColIds) throws DaoException
Parameters:
wpRowIds
wpColIds
Throws:
IOException
DaoException
double[][] cosimilarity(String[] rowPhrases, String[] colPhrases) throws DaoException
Parameters:
rowPhrases
colPhrases
Throws:
IOException
DaoException
double[][] cosimilarity(int[] ids) throws DaoException
Parameters:
ids
Throws:
IOException
DaoException
double[][] cosimilarity(String[] phrases) throws DaoException
Parameters:
phrases
Throws:
IOException
DaoException
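The difference between the one-argument (symmetric) and two-argument (rectangular) cosimilarity overloads can be sketched as follows. This is an illustration under assumptions, not the library's implementation: ToyCosimilarity and its pairwise similarity function are invented, and the sketch assumes the pairwise score is symmetric so that only the upper triangle needs to be computed in the one-argument case.

```java
final class ToyCosimilarity {
    // Hypothetical pairwise score in (0, 1]: ids that are numerically closer score higher.
    static double similarity(int id1, int id2) {
        return 1.0 / (1.0 + Math.abs(id1 - id2));
    }

    // Rectangular case: entry [r][c] is the similarity of rowIds[r] to colIds[c].
    static double[][] cosimilarity(int[] rowIds, int[] colIds) {
        double[][] m = new double[rowIds.length][colIds.length];
        for (int r = 0; r < rowIds.length; r++) {
            for (int c = 0; c < colIds.length; c++) {
                m[r][c] = similarity(rowIds[r], colIds[c]);
            }
        }
        return m;
    }

    // Symmetric case: compute the upper triangle once and mirror it.
    static double[][] cosimilarity(int[] ids) {
        int n = ids.length;
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double s = similarity(ids[i], ids[j]);
                m[i][j] = s;
                m[j][i] = s;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        double[][] square = cosimilarity(new int[] {1, 2, 4});
        double[][] rect = cosimilarity(new int[] {1, 2}, new int[] {4});
        System.out.println(square.length + "x" + square[0].length);
        System.out.println(rect.length + "x" + rect[0].length);
    }
}
```

Mirroring the upper triangle roughly halves the number of pairwise evaluations, which matters when each evaluation is expensive.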
Normalizer getMostSimilarNormalizer()
void setMostSimilarNormalizer(Normalizer n)
Parameters:
n
Normalizer getSimilarityNormalizer()
void setSimilarityNormalizer(Normalizer n)
Parameters:
n
Copyright © 2014. All rights reserved.