Research Projects

2010 - 2011

Information Navigation using Statistical Rhymes

Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Grant-in-Aid for Young Scientists (B)

Grant number
22700150
Japan Grant Number (JGN)
JP22700150
Grant amount
(Total)
4,030,000 Japanese Yen
(Direct funding)
3,100,000 Japanese Yen
(Indirect funding)
930,000 Japanese Yen

This project rests on the following assumption: words that co-occur with statistically significant frequency can serve as guides in a useful information navigation system, even when the co-occurrence does not reflect semantic similarity or relatedness. We call such co-occurrences statistical rhymes. We tried to extract statistical rhymes with Bayesian probabilistic models. As a result, we proposed a new topic extraction method, similar to latent Dirichlet allocation (LDA), that segments the word token sequences appearing in bibliographic data, such as those found in the references sections of academic papers or the publications sections of researchers' Web sites. Our method splits each bibliographic entry into segments, each corresponding to a different data field, e.g. authors, paper title, journal, pages, and publication year. Further, we improved segmentation accuracy by making the inference semi-supervised.
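
The idea of words co-occurring with statistically significant frequency can be illustrated with a small sketch. The code below is not the project's Bayesian model. It assumes a simpler, standard stand-in: the log-likelihood ratio (G^2) test on a 2x2 table of document-level co-occurrence counts. The function names, the toy data, and the threshold of 3.84 (the 5% critical value of the chi-square distribution with one degree of freedom) are illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def g2(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) statistic for a 2x2 contingency table.

    k11: entries containing both words, k12: only the first word,
    k21: only the second word,          k22: neither word.
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                stat += o * math.log(o / expected)
    return 2.0 * stat


def significant_pairs(entries, threshold=3.84):
    """Return {(word_a, word_b): G^2} for positively associated word pairs.

    `entries` is a list of token lists, one per bibliographic entry.
    The threshold is an assumed cutoff, not a value from the project.
    """
    n = len(entries)
    word_df = Counter()  # number of entries containing each word
    pair_df = Counter()  # number of entries containing each word pair
    for tokens in entries:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    result = {}
    for (a, b), k11 in pair_df.items():
        k12 = word_df[a] - k11
        k21 = word_df[b] - k11
        k22 = n - k11 - k12 - k21
        score = g2(k11, k12, k21, k22)
        # Keep only pairs that co-occur more often than chance predicts.
        if score >= threshold and k11 * n > word_df[a] * word_df[b]:
            result[(a, b)] = score
    return result


# Hypothetical toy data: tokens from eight made-up bibliographic entries.
toy_entries = [
    ["vol", "pp"], ["vol", "pp"], ["vol", "pp"], ["vol", "pp"],
    ["proc"], ["press"], ["proc"], ["press"],
]
pairs = significant_pairs(toy_entries)
```

In this toy data, "pp" and "vol" always appear together, so the pair ("pp", "vol") passes the threshold even though the two tokens are not semantically similar. This is the kind of co-occurrence the project calls a statistical rhyme.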

Link information
Kaken Url
https://kaken.nii.ac.jp/file/KAKENHI-PROJECT-22700150/22700150seika.pdf
KAKEN
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-22700150