<p>Library transactional data, such as chat transcripts and the subject metadata of items checked out together, remain largely untapped sources for innovation. Two recent projects at a research library have shown that machine learning methods can reveal trends in large sets of library transactional data. This presentation will detail the machine learning methods used in those two projects: an account-based recommender service, and data mining of chat transactions for sentiment analysis. A contention of this talk is that research library systems hold vast stores of use data whose size precludes regular analysis through traditional manual methods or basic search queries. Machine learning offers great potential to routinely analyze library big data and provide new sources of insight into user behavior and needs. The account-based recommendations begin with the clusters of items that the integrated library system records as checked out together. Examples from “consumer data science” (e.g., Netflix) make it clear that large data corpora, which receive millions of ratings daily, are part of the strategy for creating compelling recommender algorithms. Topic metadata clusters, collected from the transactional data of items that are checked out together, form the basis for generating a rule set. After nearly a year of data stream collection, the system has gathered over 250,000 rows of anonymized transactions representing checkouts with topic metadata. The research team used the data mining tool WEKA to run a machine learning process offline. Chat transcripts were analyzed using methods drawn from sentiment mining of social media data and product reviews, in order to build and test an automated sentiment analyzer. Anonymized transcripts were human-coded for sentiment to produce a gold standard dataset.
Freely available natural language learning tools built on Python and scikit-learn were then trained and tested on the dataset to develop an automated sentiment classifier. The classifier reported high levels of precision and accuracy on the test set, and the study revealed a number of fruitful paths for refining the analysis and building it into routine assessment activities.</p>
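As a rough illustration of how a rule set might be generated from clusters of co-checked-out topics, the sketch below counts topic pairs and keeps those that pass support and confidence thresholds. It is a minimal stand-in written in plain Python, not the WEKA process the research team ran; the sample data, the function name `mine_rules`, and the threshold values are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout clusters: each set holds the topic metadata of items
# checked out together in one transaction (illustrative data, not from the study).
checkouts = [
    {"statistics", "machine learning"},
    {"statistics", "machine learning", "python"},
    {"statistics", "python"},
    {"poetry", "drama"},
]


def mine_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) tuples for topic pairs.

    support    = share of transactions containing both topics
    confidence = share of transactions with the antecedent that also
                 contain the consequent
    The thresholds are assumed values chosen for this toy example.
    """
    n = len(transactions)
    topic_counts = Counter(topic for tx in transactions for topic in tx)
    pair_counts = Counter(
        pair for tx in transactions for pair in combinations(sorted(tx), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / topic_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = mine_rules(checkouts)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

A rule such as "machine learning -> statistics" could then drive a recommendation for an account whose checkouts include the antecedent topic. At the scale described above (over 250,000 rows), a dedicated implementation such as the one in WEKA would be the more practical choice.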