<p><strong>Traffic Event Extraction From Tweets</strong></p>
<p>This project has two major components: (1) the Annotator and (2) the Extractor.</p>
<p><strong>Annotator</strong></p>
<p>Raw tweets are annotated by a sequence labeling model trained with declarative knowledge from a location knowledge base and an event knowledge base. OpenStreetMap [1] provides the location knowledge specific to a city, and the <a href="http://511.org" rel="nofollow">511.org</a> [2] event schema provides the knowledge of traffic-related events. Each word in a tweet is assigned one of the following tags: B-LOCATION, I-LOCATION, B-EVENT, I-EVENT, OTHER.</p>
<p>Download all the data files from [3] and place them in a directory called "data". Download all the models (from the Files tab) and place them in a directory called "models". You can then invoke the annotator using the command:</p>
<p><em>java -cp eventannotation.jar org.ccsr.tagging.CreateAnnotatedData models/model-twitter</em></p>
<p>This command takes a while to run. Its output is a file named final-training-data.txt that contains all the event terms and locations. This file is the input for the extraction phase that follows.</p>
<p><strong>Extractor</strong></p>
<p>The extraction algorithms use the spatial, temporal, and thematic characteristics of city events to aggregate the tags and emit events.</p>
<p>Download <em>extractevents.py</em> and place the output of the annotation phase (final-training-data.txt) in a directory called "data".
Invoke the Python script that aggregates the annotations and emits events using the command:</p>
<p><em>/usr/bin/python extractevents.py</em></p>
<p><strong>Visualization</strong></p>
<p>We have created a prototype that visualizes all the city events, both those reported by the city department (<a href="http://511.org" rel="nofollow">511.org</a>) and those we have extracted from tweets: <a href="http://bit.ly/1gcSvLz" rel="nofollow">http://bit.ly/1gcSvLz</a></p>
<p><strong>References</strong></p>
<p>[1] OpenStreetMap: <a href="http://www.openstreetmap.org/" rel="nofollow">http://www.openstreetmap.org/</a></p>
<p>[2] <a href="http://511.org" rel="nofollow">511.org</a> knowledge of traffic events: <a href="http://511.org/docs/TOMSSchema.zip" rel="nofollow">http://511.org/docs/TOMSSchema.zip</a></p>
<p>[3] Dataset used for experiments: <a href="https://app.box.com/s/uvws6ztf5jzbc8cxmb9b4r6a1zuei0pt" rel="nofollow">https://app.box.com/s/uvws6ztf5jzbc8cxmb9b4r6a1zuei0pt</a></p>
<p><a href="http://creativecommons.org/licenses/by-nc-sa/4.0/" rel="nofollow"><img alt="Creative Commons License" src="http://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" style=""></a><br>This work is licensed under a <a href="http://creativecommons.org/licenses/by-nc-sa/4.0/" rel="nofollow">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.</p>
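<p><strong>Appendix: reading the tag scheme</strong></p>
<p>The Annotator section lists five per-word tags (B-LOCATION, I-LOCATION, B-EVENT, I-EVENT, OTHER). The sketch below illustrates how such tags are commonly grouped into multi-word phrases, where a B- tag begins a phrase and the I- tags of the same type continue it. It is not the code in extractevents.py: the function name <em>group_bio_spans</em>, the in-memory list of (word, tag) pairs, and the sample tweet are all assumptions made for illustration, and the actual layout of final-training-data.txt is not described here.</p>

```python
from typing import List, Tuple


def group_bio_spans(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Collapse per-word B-/I- tags into (label, phrase) spans.

    Hypothetical helper for illustration only; it is not part of the
    project's code base.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_words: List[str] = []

    def close_span() -> None:
        # Emit the phrase collected so far, if any.
        if current_words:
            spans.append((current_label, " ".join(current_words)))

    for word, tag in tagged:
        prefix, _, label = tag.partition("-")
        if prefix == "B":
            # A B- tag always starts a new phrase.
            close_span()
            current_label, current_words = label, [word]
        elif prefix == "I" and label == current_label:
            # An I- tag of the same type extends the open phrase.
            current_words.append(word)
        else:
            # OTHER (or an I- tag with no matching open phrase) ends the
            # phrase; stray I- words are dropped in this sketch.
            close_span()
            current_label, current_words = None, []

    close_span()
    return spans


# Made-up tagged tweet, used only to exercise the helper.
example = [
    ("accident", "B-EVENT"),
    ("on", "OTHER"),
    ("bay", "B-LOCATION"),
    ("bridge", "I-LOCATION"),
    ("causing", "OTHER"),
    ("heavy", "B-EVENT"),
    ("traffic", "I-EVENT"),
]
print(group_bio_spans(example))
```

<p>Grouping the tags this way yields one phrase per event term or location, which is the kind of unit an aggregation step over space, time, and theme could work with.</p>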
