# Replication Data for Fine-Grained Implicit Sentiment in Financial News: Uncovering Hidden Bulls and Bears

This replication repository is split across several sub-repositories that contain the source code and data for the experiments in *"Fine-Grained Implicit Sentiment in Financial News: Uncovering Hidden Bulls and Bears"* by Gilles Jacobs and Véronique Hoste.

## 1. Pre-processing with `sentivent-webanno-parser`

Contains scripts for parsing and pre-processing the original WebAnno export into the data formats required for the experiments in the paper: https://github.com/GillesJ/sentivent_webannoparser

NOTE: The original SENTiVENT WebAnno export files will be released after the project ends (end of 2021).

Use:

- Run `python parse_to_implicit_polar.py` to pre-process the original WebAnno export for the coarse-grained gold polar expression polarity classification experiments.
- Run `python parse_to_clause.py` to pre-process for the coarse-grained clause-based experiments.
- Run `python parse_to_gts.py` to pre-process for the fine-grained triplet BIO format.
- Several other scripts for inter-annotator agreement (IAA) and visualization are self-documenting.

## 2. Coarse-grained experiments repository

Contains the replication data in the `/data` subfolder and the source code for hyperparameter optimization, training, testing, and validation. Several utility scripts for pre-processing lexicons are also included, but the lexicons themselves cannot be included due to copyright issues. https://github.com/GillesJ/sentivent-implicit-economic-sentiment

Use:

- Model training and tokenization code is in `custom_model.py` and `custom_classification_model.py`; run the hyperparameter search with `python hyperopt_model_train.py`.
- `/data/`: contains the coarse-grained datasets in JSON format.
- `/utils/`: various utility scripts for visualization, lexicon pre-processing, and exploratory data analysis (EDA).

## 3. Fine-grained triplet experiments repository

Contains the replication data in the `/data` subfolder and the source code for hyperparameter optimization, training, testing, and validation for triplet extraction. This repository is a fork of the original work on GTS by Wu et al. (2020); make sure you are on the `sentivent` branch. https://github.com/GillesJ/GTS/tree/sentivent

Use:

- Run `python src/hyperopt.py` to run the hyperparameter search. Collect results in tables with `collect_`

## 4. Hyperparameter optimization search individual results

We used the built-in hyperparameter sweep functionality of wandb.ai with Hyperband early stopping. The implementation is in the code base of each experiment series. Each combination of encoder model and lexicon feature group has its own project page with runs. The pages listed below allow inspection of individual runs. The lexicon feature groups are:

- No lex.: no lexicon features.
- Econ.: economic-domain lexicons.
- All: general-domain and economic-domain lexicon features.

### 4.1. Coarse-grained gold polar expressions

- RoBERTa-large: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-roberta-large), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-roberta-large), [All](https://wandb.ai/gillesjacobs/senti-lexall-roberta-large)
- RoBERTa-base: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-roberta-base), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-roberta-base), [All](https://wandb.ai/gillesjacobs/senti-lexall-roberta-base)
- BERT-large: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-bert-large-cased), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-bert-large-cased), [All](https://wandb.ai/gillesjacobs/senti-lexall-bert-large-cased)
- BERT-base: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-bert-base-cased-fix), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-bert-base-cased-fix), [All](https://wandb.ai/gillesjacobs/senti-lexall-bert-base-cased-fix)
- DeBERTa-base: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-microsoft-deberta-base), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-microsoft-deberta-base), [All](https://wandb.ai/gillesjacobs/senti-lexall-microsoft-deberta-base)
- FinBERT-FinVocab: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-finbert-finvocab-uncased), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-finbert-finvocab-uncased), [All](https://wandb.ai/gillesjacobs/senti-lexall-finbert-finvocab-uncased)
- FinBERT-TRC2+FP: [No lex.](https://wandb.ai/gillesjacobs/senti-nolex-ProsusAI-finbert), [Econ.](https://wandb.ai/gillesjacobs/senti-lexecon-ProsusAI-finbert), [All](https://wandb.ai/gillesjacobs/senti-lexall-ProsusAI-finbert)

### 4.2. Coarse-grained clause-based experiments

- BERT-large: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-bert-large-cased), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-bert-large-cased), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-bert-large-cased)
- BERT-base: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-bert-base-cased), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-bert-base-cased), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-bert-base-cased)
- FinBERT-FinVocab: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-finbert-finvocab-uncased), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-finbert-finvocab-uncased), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-finbert-finvocab-uncased)
- RoBERTa-base: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-roberta-base), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-roberta-base), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-roberta-base)
- DeBERTa-base: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-microsoft-deberta-base), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-microsoft-deberta-base), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-microsoft-deberta-base)
- FinBERT-TRC2+FP: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-ProsusAI-finbert), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-ProsusAI-finbert), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-ProsusAI-finbert)
- RoBERTa-large: [No lex.](https://wandb.ai/gillesjacobs/impliclaus-nolex-roberta-large), [Econ.](https://wandb.ai/gillesjacobs/impliclaus-lexecon-roberta-large), [All](https://wandb.ai/gillesjacobs/impliclaus-lexall-roberta-large)

### 4.3. Fine-grained triplet experiments

SENTiVENT (ours):

- [DeBERTa-base](https://wandb.ai/gillesjacobs/microsoft_deberta_base-triplet-sentivent)
- [FinBERT-TRC2+FP](https://wandb.ai/gillesjacobs/prosusai_finbert-triplet-sentivent)
- [BERT-large](https://wandb.ai/gillesjacobs/bert_large_cased-triplet-sentivent)
- [BERT-base](https://wandb.ai/gillesjacobs/bert_base_cased-triplet-sentivent)
- [RoBERTa-base](https://wandb.ai/gillesjacobs/roberta_base-triplet-sentivent)
- [RoBERTa-large](https://wandb.ai/gillesjacobs/roberta_large-triplet-sentivent)

Explicit Wu et al. (2020):

- [RoBERTa-base](https://wandb.ai/gillesjacobs/roberta_base-triplet-joinedsemeval)
- [RoBERTa-large](https://wandb.ai/gillesjacobs/roberta_large-triplet-joinedsemeval)
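
For readers unfamiliar with wandb sweeps, the setup described in section 4 (a hyperparameter sweep with Hyperband early stopping, one project per encoder and lexicon feature group) might be configured roughly as sketched below. This is an illustrative sketch only, not the configuration used in the experiments: the search method, the metric name `eval_f1`, the hyperparameter names and ranges, and the helper names `build_sweep_config` and `launch_sweep` are all assumptions. Only the project naming pattern (for example `senti-nolex-roberta-base`) is taken from the project pages listed above. The actual implementation is in each experiment repository.

```python
"""Illustrative wandb sweep configuration with Hyperband early stopping.

All hyperparameters, ranges, and the metric name are hypothetical
placeholders; consult the experiment repositories for the real settings.
"""


def build_sweep_config(encoder: str, lexicon_group: str) -> dict:
    """Return a sweep configuration dict for one encoder + lexicon group."""
    return {
        # Mirrors the project naming seen in the run pages, e.g.
        # "senti-nolex-roberta-base".
        "name": f"senti-{lexicon_group}-{encoder}",
        "method": "random",  # assumption: random search
        "metric": {"name": "eval_f1", "goal": "maximize"},  # hypothetical metric
        "parameters": {
            # Hypothetical search space.
            "learning_rate": {"min": 1e-6, "max": 1e-4},
            "num_train_epochs": {"values": [4, 8, 16]},
            "train_batch_size": {"values": [16, 32]},
        },
        # Hyperband early termination stops poorly performing runs early.
        "early_terminate": {"type": "hyperband", "min_iter": 2},
    }


def launch_sweep(config: dict, project: str, train_fn, count: int = 10) -> str:
    """Register the sweep with wandb and run `count` trials of `train_fn`."""
    import wandb  # imported lazily so the config can be built without wandb

    sweep_id = wandb.sweep(config, project=project)
    wandb.agent(sweep_id, function=train_fn, project=project, count=count)
    return sweep_id


config = build_sweep_config("roberta-base", "nolex")
print(config["name"])
print(config["early_terminate"])
```

Calling `launch_sweep(config, project=config["name"], train_fn=...)` with a training function that logs the chosen metric through `wandb.log` would start the search; it requires a wandb account and is therefore not executed in the sketch.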