Modifier and Type | Class and Description |
---|---|
static class | SimpleMilneWitten.Provider |
Constructor and Description |
---|
SimpleMilneWitten(String name, Language language, LocalPageDao pageDao, LocalLinkDao linkDao, AnchorTextPhraseAnalyzer phraseAnalyzer) |
Modifier and Type | Method and Description |
---|---|
double[][] | cosimilarity(int[] ids): Construct a symmetric cosimilarity matrix of Wikipedia ids in a given language. |
double[][] | cosimilarity(int[] wpRowIds, int[] wpColIds): Construct a cosimilarity matrix of Wikipedia ids in a given language. |
double[][] | cosimilarity(String[] phrases): Construct a symmetric cosimilarity matrix of phrases by mapping through local pages. |
double[][] | cosimilarity(String[] rowPhrases, String[] colPhrases): Construct a cosimilarity matrix of phrases. |
File | getDataDir(): Returns the directory containing all data for the metric. |
Language | getLanguage() |
Normalizer | getMostSimilarNormalizer() |
String | getName() |
Normalizer | getSimilarityNormalizer() |
SRResultList | mostSimilar(int pageId, int maxResults): Find the most similar local pages to a local page within the same language. |
SRResultList | mostSimilar(int pageId, int maxResults, gnu.trove.set.TIntSet validIds): Find the most similar local pages to a local page, restricted to validIds when it is non-null. |
SRResultList | mostSimilar(String phrase, int maxResults): Find the most similar local pages to a phrase. |
SRResultList | mostSimilar(String phrase, int maxResults, gnu.trove.set.TIntSet validIds): Find the most similar local pages to a phrase, restricted to validIds when it is non-null. |
boolean | mostSimilarIsTrained() |
void | read(): Reads the metric from the current data directory. |
void | setDataDir(File dir): Sets the data directory associated with the model. |
void | setMostSimilarNormalizer(Normalizer n): Sets the most similar normalizer. |
void | setSimilarityNormalizer(Normalizer n): Sets the similarity normalizer. |
SRResult | similarity(int pageId1, int pageId2, boolean explanations): Determine the similarity between two local pages. |
SRResult | similarity(String phrase1, String phrase2, boolean explanations): Determine the similarity between two strings in a given language by mapping through local pages. |
boolean | similarityIsTrained() |
void | trainMostSimilar(Dataset dataset, int numResults, gnu.trove.set.TIntSet validIds): Train the mostSimilar() function. The KnownSims may already be associated with Wikipedia ids (check wpId1 and wpId2). |
void | trainSimilarity(Dataset dataset): Train the similarity() function. |
void | write(): Writes the metric to the current data directory. |
public SimpleMilneWitten(String name, Language language, LocalPageDao pageDao, LocalLinkDao linkDao, AnchorTextPhraseAnalyzer phraseAnalyzer) throws DaoException
DaoException
public String getName()
public Language getLanguage()
getLanguage
in interface SRMetric
public File getDataDir()
SRMetric
getDataDir
in interface SRMetric
public void setDataDir(File dir)
SRMetric
setDataDir
in interface SRMetric
public SRResult similarity(int pageId1, int pageId2, boolean explanations) throws DaoException
SRMetric
similarity
in interface SRMetric
pageId1 - Id of the first page.
pageId2 - Id of the second page.
explanations - Whether explanations should be created.
DaoException
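This page does not state how the page-to-page score is computed. The class name suggests the link-based relatedness measure published by Milne and Witten, so the following is a minimal sketch of that published formula, not this class's implementation. The class name MilneWittenSketch, the use of plain java.util sets for link sets, and the clamping of negative values to 0 are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-alone sketch; it does not use or mirror the library's classes.
final class MilneWittenSketch {

    /**
     * relatedness = 1 - (log(max(|A|,|B|)) - log(|A intersect B|))
     *                   / (log(N) - log(min(|A|,|B|)))
     * where A and B are the sets of pages linking to each article
     * and N is the total number of pages.
     */
    static double relatedness(Set<Integer> linksA, Set<Integer> linksB, int totalPages) {
        if (linksA.isEmpty() || linksB.isEmpty()) {
            return 0.0; // no link evidence at all
        }
        Set<Integer> shared = new HashSet<>(linksA);
        shared.retainAll(linksB);
        if (shared.isEmpty()) {
            return 0.0; // log(0) is undefined; treat as unrelated
        }
        double max = Math.max(linksA.size(), linksB.size());
        double min = Math.min(linksA.size(), linksB.size());
        if (totalPages <= min) {
            throw new IllegalArgumentException("totalPages must exceed the smaller link count");
        }
        double distance = (Math.log(max) - Math.log(shared.size()))
                / (Math.log(totalPages) - Math.log(min));
        // Assumption: clamp so that weakly overlapping pairs score 0 rather than a negative value.
        return Math.max(0.0, 1.0 - distance);
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> b = new HashSet<>(Arrays.asList(3, 4, 5, 6));
        System.out.println(relatedness(a, b, 100));
    }
}
```

Identical link sets score 1.0 under this formula, and pairs with no shared links score 0.0.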
public SRResult similarity(String phrase1, String phrase2, boolean explanations) throws DaoException
SRMetric
similarity
in interface SRMetric
phrase1 - The first phrase.
phrase2 - The second phrase.
explanations - Whether explanations should be created.
DaoException
public SRResultList mostSimilar(int pageId, int maxResults) throws DaoException
SRMetric
mostSimilar
in interface SRMetric
pageId - The id of the local page whose similarity we are examining.
maxResults - The maximum number of results to return.
DaoException
public SRResultList mostSimilar(int pageId, int maxResults, gnu.trove.set.TIntSet validIds) throws DaoException
SRMetric
mostSimilar
in interface SRMetric
pageId - The id of the local page whose similarity we are examining.
maxResults - The maximum number of results to return.
validIds - The local page ids to be considered. Null means all ids in the language.
DaoException
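The maxResults and validIds parameters describe a filtered top-k selection. The sketch below shows one common way to do that with a bounded min-heap; it is an illustration under stated assumptions, not the library's code. MostSimilarSketch, Scored, candidateIds, and scoreOf are hypothetical names, and a java.util.Set<Integer> stands in for gnu.trove.set.TIntSet so the example needs no external dependency.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch of top-k selection with an optional id filter.
final class MostSimilarSketch {

    // Minimal (id, score) pair; a stand-in for a result entry.
    static final class Scored {
        final int id;
        final double score;

        Scored(int id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    static List<Scored> mostSimilar(int[] candidateIds, IntToDoubleFunction scoreOf,
                                    int maxResults, Set<Integer> validIds) {
        List<Scored> results = new ArrayList<>();
        if (maxResults <= 0) {
            return results;
        }
        // Min-heap on score: the weakest retained candidate sits at the head.
        PriorityQueue<Scored> heap =
                new PriorityQueue<>(Comparator.comparingDouble((Scored s) -> s.score));
        for (int id : candidateIds) {
            // A null filter means every id is eligible.
            if (validIds != null && !validIds.contains(id)) {
                continue;
            }
            heap.add(new Scored(id, scoreOf.applyAsDouble(id)));
            if (heap.size() > maxResults) {
                heap.poll(); // drop the current weakest candidate
            }
        }
        results.addAll(heap);
        // Return the survivors from highest to lowest score.
        results.sort(Comparator.comparingDouble((Scored s) -> s.score).reversed());
        return results;
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3, 4};
        for (Scored s : mostSimilar(ids, id -> id * 0.1, 2, null)) {
            System.out.println(s.id + " " + s.score);
        }
    }
}
```

The heap never holds more than maxResults entries, so memory stays bounded however many candidates are scanned.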
public SRResultList mostSimilar(String phrase, int maxResults) throws DaoException
SRMetric
mostSimilar
in interface SRMetric
phrase - The phrase whose similarity we are examining.
maxResults - The maximum number of results to return.
DaoException
public SRResultList mostSimilar(String phrase, int maxResults, gnu.trove.set.TIntSet validIds) throws DaoException
SRMetric
mostSimilar
in interface SRMetric
phrase - The phrase whose similarity we are examining.
maxResults - The maximum number of results to return.
validIds - The local page ids to be considered. Null means all ids in the language.
DaoException
public void write() throws IOException
SRMetric
write
in interface SRMetric
IOException
public void read()
SRMetric
public void trainSimilarity(Dataset dataset) throws DaoException
SRMetric
trainSimilarity
in interface SRMetric
dataset - A gold standard dataset.
DaoException
public void trainMostSimilar(Dataset dataset, int numResults, gnu.trove.set.TIntSet validIds)
SRMetric
trainMostSimilar
in interface SRMetric
dataset - A gold standard dataset.
numResults - The maximum number of similar articles computed per phrase.
validIds - The Wikipedia ids that should be considered in result sets. Null means all ids.
public boolean similarityIsTrained()
similarityIsTrained
in interface SRMetric
public boolean mostSimilarIsTrained()
mostSimilarIsTrained
in interface SRMetric
public double[][] cosimilarity(int[] wpRowIds, int[] wpColIds) throws DaoException
SRMetric
cosimilarity
in interface SRMetric
DaoException
public double[][] cosimilarity(String[] rowPhrases, String[] colPhrases) throws DaoException
SRMetric
cosimilarity
in interface SRMetric
DaoException
public double[][] cosimilarity(int[] ids) throws DaoException
SRMetric
cosimilarity
in interface SRMetric
DaoException
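A symmetric cosimilarity matrix over a single id list can be filled from a pairwise score by computing only the upper triangle and mirroring it. The sketch below illustrates that idea; it assumes the pairwise score itself is symmetric, and CosimilaritySketch and PairSimilarity are hypothetical names rather than part of this API.

```java
// Hypothetical sketch: build a symmetric matrix from a symmetric pairwise score.
final class CosimilaritySketch {

    // Stand-in for any function that scores two page ids.
    interface PairSimilarity {
        double of(int idA, int idB);
    }

    static double[][] cosimilarity(int[] ids, PairSimilarity sim) {
        int n = ids.length;
        double[][] matrix = new double[n][n];
        for (int i = 0; i < n; i++) {
            matrix[i][i] = sim.of(ids[i], ids[i]);
            // Score each unordered pair once, then mirror it across the diagonal.
            for (int j = i + 1; j < n; j++) {
                double score = sim.of(ids[i], ids[j]);
                matrix[i][j] = score;
                matrix[j][i] = score;
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        int[] ids = {10, 11, 13};
        // Toy score for demonstration only: closer ids count as more similar.
        double[][] m = cosimilarity(ids, (a, b) -> 1.0 / (1 + Math.abs(a - b)));
        for (double[] row : m) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Row i and column i both correspond to ids[i], so for n ids this makes n(n + 1)/2 pairwise calls instead of n squared.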
public double[][] cosimilarity(String[] phrases) throws DaoException
SRMetric
cosimilarity
in interface SRMetric
DaoException
public Normalizer getMostSimilarNormalizer()
getMostSimilarNormalizer
in interface SRMetric
public void setMostSimilarNormalizer(Normalizer n)
SRMetric
setMostSimilarNormalizer
in interface SRMetric
public Normalizer getSimilarityNormalizer()
getSimilarityNormalizer
in interface SRMetric
public void setSimilarityNormalizer(Normalizer n)
SRMetric
setSimilarityNormalizer
in interface SRMetric
Copyright © 2014. All rights reserved.