# Replication Data for Fine-Grained Implicit Sentiment in Financial News: Uncovering Hidden Bulls and Bears

This replication repository is split across several sub-repositories that contain the source code and data for the experiments in *"Fine-Grained Implicit Sentiment in Financial News: Uncovering Hidden Bulls and Bears"* by Gilles Jacobs and Véronique Hoste.

## 1. Pre-processing with `sentivent-webanno-parser`

Contains scripts for parsing and pre-processing the original WebAnno export into the data formats required for the experiments in the paper at

NOTE: The original SENTiVENT WebAnno export files will be released after the project ends (end of 2021).

Use:
- Run `python` to pre-process the original WebAnno export for the coarse-grained gold polar expression polarity classification experiments.
- Run `python` to pre-process for the coarse-grained clause-based experiments.
- Run `python` to pre-process for the fine-grained triplet BIO format.
- There are several other scripts for inter-annotator agreement (IAA) and visualization that are self-documenting.

## 2. Coarse-grained experiments repository

Contains replication data in the `/data` subfolder and source code for hyperparameter optimization, training, testing, and validation. Several utility scripts for pre-processing the lexicons are also included, but the lexicons themselves cannot be included due to copyright restrictions.

Use:
- Model training and tokenization code is in `` and ``; run the hyperparameter optimization search with `python`
- `/data/`: contains the coarse-grained datasets in JSON format.
- `/utils/`: various utility scripts for visualization, lexicon pre-processing, and exploratory data analysis (EDA).

## 3. Fine-grained triplet experiments repository

Contains replication data in the `/data` subfolder and source code for hyperparameter optimization, training, testing, and validation for triplet extraction. This repository is a fork of the original work on GTS by Wu et al. (2020) (make sure you are on the `#sentivent` branch).

Use:
- Run `python src/` to train with the hyperparameter optimization search. Collect results in tables with `collect_`

## 4. Hyperparameter optimization search individual results

We used built-in hyperparameter sweep functionality with hyperband early stopping. The implementation is in the code base of each experiment series. For each combination of encoder model and lexicon feature group there is a separate project page with runs. The webpages listed below allow inspection of individual runs, grouped by lexicon feature set:
- No lex.: No lexicon features.
- Econ.: Economic-domain lexicons.
- All: General-domain and economic-domain lexicon features.

### 4.1. Coarse-grained gold polar expressions

- RoBERTa-large: No lex., Econ., All
- RoBERTa-base: No lex., Econ., All
- BERT-large: No lex., Econ., All
- BERT-base: No lex., Econ., All
- DeBERTa-base: No lex., Econ., All
- FinBERT-FinVocab: No lex., Econ., All
- FinBERT-TRC2+FP: No lex., Econ., All

### 4.2. Coarse-grained clause-based experiments

- BERT-large: No lex., Econ., All
- BERT-base: No lex., Econ., All
- FinBERT-FinVocab: No lex., Econ., All
- RoBERTa-base: No lex., Econ., All
- DeBERTa-base: No lex., Econ., All
- FinBERT-TRC2+FP: No lex., Econ., All
- RoBERTa-large: No lex., Econ., All

### 4.3. Fine-grained triplet experiments

SENTiVENT (ours):
- DeBERTa-base
- FinBERT-TRC2+FP
- BERT-large
- BERT-base
- RoBERTa-base
- RoBERTa-large

Explicit Wu et al. (2020):
- RoBERTa-base
- RoBERTa-large