HIVE-4-MAT is a linked-data automatic indexing application for vocabularies related to materials science. Over the past few months, the keyword alignment algorithm has been refactored to make it faster, more accurate, and more flexible, at some cost to precision. This presentation reports on the lessons learned during that refactoring. Because HIVE-4-MAT has a fairly broad scope, it is a good use case for analyzing a complete keyword alignment pipeline: scraping raw article text, extracting keywords, and then matching and aligning those keywords. The presentation will touch on common pitfalls of web scraping, different strategies for preparing raw text for keyword extraction, the differing goals of keyword extraction and keyword alignment, and the potential benefits and drawbacks of using string distance in keyword alignment algorithms.
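To make the idea of string distance in keyword alignment concrete, here is a minimal Python sketch. It is not the HIVE-4-MAT implementation: the function names `levenshtein` and `align`, the `max_ratio` threshold, and the sample vocabulary are all assumptions chosen for illustration. It matches an extracted keyword to the closest vocabulary term by Levenshtein (edit) distance, normalized by the length of the longer string.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def align(keyword, vocabulary, max_ratio=0.2):
    """Return the closest vocabulary term, or None if nothing is close enough.

    max_ratio is a hypothetical tuning knob: the largest allowed edit
    distance as a fraction of the longer string's length.
    """
    key = keyword.strip().lower()
    best_term, best_ratio = None, None
    for term in vocabulary:
        candidate = term.strip().lower()
        longest = max(len(key), len(candidate)) or 1  # avoid division by zero
        ratio = levenshtein(key, candidate) / longest
        if best_ratio is None or ratio < best_ratio:
            best_term, best_ratio = term, ratio
    if best_ratio is not None and best_ratio <= max_ratio:
        return best_term
    return None


# Illustrative vocabulary, not taken from HIVE-4-MAT.
vocabulary = ["polymer", "ceramic", "alloy"]
print(align("polymers", vocabulary))
print(align("graphene", vocabulary))
```

In this sketch, a higher `max_ratio` lets more spelling variants align to a vocabulary term but also admits more false matches, which is one form the trade-off between flexibility and precision can take.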