Dictionary files are formatted based on whether they have weights associated with each term:
- `.csv` (comma-separated values) when weighted, with terms in the first column, one category per subsequent column, and values rescaled as `original / original max * 100`.
- `.dic` ([LIWC][1]-formatted) when unweighted.
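
The rescaling of weighted values can be illustrated with a minimal Python sketch. The function name `rescale_weights` and the sample terms are hypothetical, and the sketch assumes that "original max" means the plain maximum of one category's original values; how negative values were handled in the actual files is not stated here.

```python
def rescale_weights(weights):
    """Rescale one category's weights as original / original max * 100.

    `weights` maps each term to its original value. Assumes the maximum
    is positive; this is an illustration, not the repository's own code.
    """
    top = max(weights.values())
    return {term: value / top * 100 for term, value in weights.items()}


# Hypothetical original values for a single category.
raw = {"good": 3.0, "great": 4.0, "fine": 1.0}
scaled = rescale_weights(raw)
```

Under this assumption, the term with the largest original value ends up with a weight of 100, and every other term is expressed as a percentage of it.
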
Terms in either dictionary format only include Basic Latin characters (ASCII, U+0020 through U+007F):
`'- !\"#$%&()*,./:;?@[\]^_``{|}~+<=>0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ`
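
As a rough illustration, a term can be checked against the code point range stated above with a few lines of Python. The helper name `is_basic_latin` is hypothetical, and the check uses the range exactly as given (U+0020 through U+007F) rather than the explicit character list.

```python
def is_basic_latin(term):
    """Return True if every character falls in U+0020 through U+007F."""
    return all(0x20 <= ord(ch) <= 0x7F for ch in term)


# Hypothetical terms: a glob-style term and one with an accented character.
plain_ok = is_basic_latin("happy*")
accented_ok = is_basic_latin("café")
```

A check like this could be useful before matching, since texts containing accented or otherwise non-ASCII characters may need to be transliterated for their words to match dictionary terms.
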
The [dict_info.csv][2] file contains information about each dictionary, and its [variables wiki][3] has more information about its columns.
The [lingmatch][4] R package offers tools to download and use these files. `.dic` files can also be loaded into the [adicat highlighter][5] (menu > Dictionary > load/create/edit > load), and potentially exported in JavaScript Object Notation (JSON) format. The [load_dictionary.r][6] and [load_dictionary.py][7] files also offer standalone functions to load dictionaries into R or Python environments.
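
For readers who want to see what reading the two formats involves, here is a minimal, standalone Python sketch. It is not the code in load_dictionary.py, and the function names `read_weighted` and `read_dic` are hypothetical. It assumes the weighted files have a header row naming the categories, and that `.dic` files follow the common LIWC layout: a category block between two `%` lines with tab-separated IDs and names, followed by lines of a term and its tab-separated category IDs.

```python
import csv
import io


def read_weighted(text):
    """Parse weighted CSV text into {category: {term: weight}}.

    Assumes a header row, terms in the first column, and one
    category per column after it; empty cells are skipped.
    """
    rows = csv.reader(io.StringIO(text))
    categories = next(rows)[1:]
    result = {category: {} for category in categories}
    for row in rows:
        if not row:
            continue
        for category, value in zip(categories, row[1:]):
            if value != "":
                result[category][row[0]] = float(value)
    return result


def read_dic(text):
    """Parse LIWC-style text into {category: [terms]}.

    Assumes tab-separated fields and a category block
    delimited by two lines containing only "%".
    """
    names = {}   # category ID -> category name
    result = {}  # category name -> list of terms
    boundaries = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        if boundaries < 2 and line.strip() == "%":
            boundaries += 1
            continue
        parts = [part.strip() for part in line.split("\t") if part.strip()]
        if boundaries == 1:
            # Inside the category block: ID, then name.
            names[parts[0]] = parts[1]
            result[parts[1]] = []
        elif boundaries == 2:
            # After the category block: term, then category IDs.
            for category_id in parts[1:]:
                result[names[category_id]].append(parts[0])
    return result


# Small made-up examples of each format.
csv_text = "term,valence,arousal\ngood,100,40\ncalm,60,\n"
dic_text = "%\n1\tpositive\n2\tnegative\n%\ngood\t1\nbad*\t2\nbittersweet\t1\t2\n"

weights = read_weighted(csv_text)
categories = read_dic(dic_text)
```

To use the sketch with a downloaded file, the file's contents could be read with `open(path, encoding="utf-8").read()` and passed to the matching function. Splitting `.dic` lines on tabs rather than on any whitespace keeps multi-word (ngram) terms intact.
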
## Available Dictionaries
- **[As-needed Dictionary-Based Categorization - Function][8]**:
[adicat_function.dic][9] (`8 KB`; `12` language categories; `759` glob+ terms)
- **[Automated Dictionary Creation for Analysis of Text][10]**:
[adict.dic][11] (`165 KB`; `36` social categories; `12,168` ngram terms)
- **[AFINN][12]**:
[afinn.csv][13] (`59 KB`; `1` emotion category; `3,381` unigram terms)
- **[Agency and Communion Dictionaries][14]**:
[agency_communion.dic][15] (`6 KB`; `2` social categories; `447` glob terms)
- **[AllSlang][16]**:
[allslang.dic][17] (`99 KB`; `5` language categories; `10,109` ngram terms)
- **[Affective Norms for English Words][18]**:
[anew.csv][19] (`41 KB`; `3` impression categories; `1,034` unigram terms)
- **[Affective Norms for English Words - emotion ratings][20]**:
[anew_emotion.csv][21] (`51 KB`; `4` emotion categories; `1,034` unigram terms)
- **[BanBuilder][22]**:
[banbuilder.dic][23] (`2 KB`; `1` impression category; `199` unigram terms)
- **[Swear Word List][24]**:
[banned.dic][25] (`1 KB`; `1` impression category; `77` ngram terms)
- **[Cost Benefit Dictionary][26]**:
[cost_benefit.dic][27] (`2 KB`; `2` social categories; `154` glob terms)
- **[Center for Reading Research - Age of Acquisition][28]**:
[crr_aoa.csv][29] (`593 KB`; `1` language category; `31,105` unigram terms)
- **[Center for Reading Research - Concreteness and Familiarity][30]**:
[crr_concreteness.csv][31] (`988 KB`; `2` language categories; `39,954` ngram terms)
- **[Center for Reading Research - Valence, Arousal, and Dominance][32]**:
[crr_vad.csv][33] (`558 KB`; `3` impression categories; `13,915` ngram terms)
- **[DepecheMood++][34]**:
[depechemood.csv][35] (`7.71 MB`; `8` emotion categories; `114,000` unigram terms)
- **[EmoLex][36]**:
[emolex.csv][37] (`1.46 MB`; `8` emotion categories; `28,480` unigram terms)
- **[EmoSenticNet][38]**:
[emosenticnet.dic][39] (`183 KB`; `6` emotion categories; `13,175` ngram terms)
- **[EMOtion TErms (EMOTE)][40]**:
[emote.csv][41] (`198 KB`; `10` impression categories; `2,197` unigram terms)
- **[Geneva Affect Label Coder][42]**:
[galc.dic][43] (`4 KB`; `38` emotion categories; `274` glob terms)
- **[Gendered Wording][44]**:
[gendered_wording.dic][45] (`2 KB`; `2` social categories; `82` glob terms)
- **[Google Profanity Words][46]**:
[google_banned.dic][47] (`5 KB`; `1` impression category; `451` unigram terms)
- **[Hu & Liu][48]**:
[huliu.dic][49] (`83 KB`; `2` emotion categories; `6,789` unigram terms)
- **[Humor Norms][50]**:
[humor.csv][51] (`106 KB`; `1` impression category; `4,997` unigram terms)
- **[General Inquirer][52]**:
[inquirer.dic][53] (`258 KB`; `226` general categories; `8,624` unigram terms)
- **[List of Insults][54]**:
[insults.dic][55] (`8 KB`; `1` impression category; `637` ngram terms)
- **[Language Assessment by Mechanical Turk][56]**:
[labmt.csv][57] (`79 KB`; `1` emotion category; `3,934` unigram terms)
- **[Loughran & McDonald][58]**:
[loughranmcdonald.dic][59] (`55 KB`; `9` emotion categories; `4,159` unigram terms)
- **[Language Use and Social Interaction (Lusi) lab dictionary][60]**:
[lusi.dic][61] (`16 KB`; `10` social categories; `1,601` glob terms)
- **[Language Use and Social Interaction (Lusi) lab - netspeak][62]**:
[lusi_netspeak.dic][63] (`10 KB`; `12` language categories; `1,161` glob+ terms)
- **[Moral Foundations Dictionary][64]**:
[mfd.dic][65] (`25 KB`; `10` social categories; `2,104` ngram terms)
- **[Multi-Perspective Question Answering - Arguing][66]**:
[mpqa_arguing.dic][67] (`12 KB`; `17` language categories; `217` regex terms)
- **[Multi-Perspective Question Answering - Effects][68]**:
[mpqa_effects.dic][69] (`76 KB`; `2` impression categories; `6,568` ngram terms)
- **[Multi-Perspective Question Answering - Subjectivity][70]**:
[mpqa_subjectivity.dic][71] (`111 KB`; `9` language categories; `6,885` unigram terms)
- **[Micro-WordNet Opinion][72]**:
[mwnop.csv][73] (`27 KB`; `2` emotion categories; `1,299` ngram terms)
- **[National Research Council Canada - Word-Colour Association Lexicon][74]**:
[nrc_color.csv][75] (`540 KB`; `11` impression categories; `13,518` unigram terms)
- **[National Research Council Canada - Emotion Intensity Lexicon][76]**:
[nrc_eil.csv][77] (`238 KB`; `8` emotion categories; `5,975` unigram terms)
- **[National Research Council Canada - Macquarie Semantic Orientation Lexicon][78]**:
[nrc_sentiment.dic][79] (`1.04 MB`; `2` emotion categories; `79,280` ngram terms)
- **[National Research Council Canada - Valence, Arousal, and Dominance][80]**:
[nrc_vad.csv][81] (`497 KB`; `3` impression categories; `20,007` unigram terms)
- **[National Research Council Canada - Yelp Word-Aspect Association Lexicons][82]**:
[nrc_yelp.csv][83] (`1.56 MB`; `5` emotion categories; `30,376` unigram terms)
- **[Offensive/Profane Word List][84]**:
[offensive.dic][85] (`15 KB`; `1` impression category; `1,383` unigram terms)
- **[Personal Values Dictionary][86]**:
[personal_values.dic][87] (`16 KB`; `14` social categories; `1,068` unigram terms)
- **[Trait Adjectives][88]**:
[personality.dic][89] (`8 KB`; `9` social categories; `532` glob terms)
- **[Pornography][90]**:
[pornography.dic][91] (`12 KB`; `8` social categories; `850` glob terms)
- **[Prosocial Words Dictionary][92]**:
[prosocial.dic][93] (`2 KB`; `1` social category; `145` glob terms)
- **[Regulatory Mode][94]**:
[regulatory_mode.dic][95] (`1 KB`; `2` social categories; `68` glob terms)
- **[SentimentDictionaries - 8-K filings][96]**:
[sd_8k.csv][97] (`4 KB`; `1` emotion category; `173` stem terms)
- **[SentimentDictionaries - IMDB reviews][98]**:
[sd_imdb.csv][99] (`12 KB`; `1` emotion category; `550` stem terms)
- **[English Security/Migration/Terrorism Lexicon][100]**:
[securitization.dic][101] (`3 KB`; `1` social category; `260` glob terms)
- **[SenticNet][102]**:
[senticnet.csv][103] (`2.2 MB`; `1` emotion category; `100,000` ngram terms)
- **[SentiWordNet][104]**:
[sentiwordnet.csv][105] (`604 KB`; `2` emotion categories; `24,123` ngram terms)
- **[Slang Sentiment Dictionary][106]**:
[slangsd.csv][107] (`1.23 MB`; `1` emotion category; `64,061` ngram terms)
- **[Loughran & McDonald - stopwords][108]**:
[stopwords.dic][109] (`128 KB`; `5` language categories; `12,741` unigram terms)
- **[Valence Aware Dictionary for sEntiment Reasoning (Vader)][110]**:
[vader.csv][111] (`124 KB`; `1` emotion category; `7,515` unigram terms)
- **[Whissell][112]**:
[whissell.csv][113] (`300 KB`; `3` impression categories; `8,748` unigram terms)
- **[World Well-Being Project - Affect and Intensity][114]**:
[wwbp_affect_intensity.csv][115] (`62 KB`; `2` impression categories; `2,208` pattern terms)
- **[World Well-Being Project - Age][116]**:
[wwbp_age.csv][117] (`205 KB`; `1` social category; `10,797` pattern terms)
- **[World Well-Being Project - Empathetic Concerns and Personal Distress][118]**:
[wwbp_empathy_distress.csv][119] (`282 KB`; `2` emotion categories; `9,343` pattern terms)
- **[World Well-Being Project - Prospection][120]**:
[wwbp_prospection.csv][121] (`16 KB`; `3` social categories; `556` pattern terms)
- **[World Well-Being Project - Sex][122]**:
[wwbp_sex.csv][123] (`136 KB`; `1` social category; `7,137` pattern terms)
- **[World Well-Being Project - Well-Being][124]**:
[wwbp_wellbeing.csv][125] (`250 KB`; `10` emotion categories; `5,216` pattern terms)
[1]: https://liwc.wpengine.com
[2]: https://osf.io/kjqb8
[3]: https://osf.io/y6g5b/wiki/dict_variables
[4]: https://github.com/miserman/lingmatch
[5]: https://miserman.github.io/adicat/highlight
[6]: https://osf.io/awyje
[7]: https://osf.io/zdsfu
[8]: https://osf.io/y6g5b/wiki/adicat_function
[9]: https://osf.io/download/awsqk
[10]: https://osf.io/y6g5b/wiki/adict
[11]: https://osf.io/download/6sgkd
[12]: https://osf.io/y6g5b/wiki/afinn
[13]: https://osf.io/download/zexfc
[14]: https://osf.io/y6g5b/wiki/agency_communion
[15]: https://osf.io/download/284xp
[16]: https://osf.io/y6g5b/wiki/allslang
[17]: https://osf.io/download/ghc8a
[18]: https://osf.io/y6g5b/wiki/anew
[19]: https://osf.io/download/cq6ng
[20]: https://osf.io/y6g5b/wiki/anew_emotion
[21]: https://osf.io/download/3vg9n
[22]: https://osf.io/y6g5b/wiki/banbuilder
[23]: https://osf.io/download/8hdnx
[24]: https://osf.io/y6g5b/wiki/banned
[25]: https://osf.io/download/mxhrc
[26]: https://osf.io/y6g5b/wiki/cost_benefit
[27]: https://osf.io/download/5vrn4
[28]: https://osf.io/y6g5b/wiki/crr_aoa
[29]: https://osf.io/download/fwvz6
[30]: https://osf.io/y6g5b/wiki/crr_concreteness
[31]: https://osf.io/download/ez2m4
[32]: https://osf.io/y6g5b/wiki/crr_vad
[33]: https://osf.io/download/j7ek9
[34]: https://osf.io/y6g5b/wiki/depechemood
[35]: https://osf.io/download/2h4ty
[36]: https://osf.io/y6g5b/wiki/emolex
[37]: https://osf.io/download/q7g3h
[38]: https://osf.io/y6g5b/wiki/emosenticnet
[39]: https://osf.io/download/q8uzr
[40]: https://osf.io/y6g5b/wiki/emote
[41]: https://osf.io/download/vxfwt
[42]: https://osf.io/y6g5b/wiki/galc
[43]: https://osf.io/download/rd4bt
[44]: https://osf.io/y6g5b/wiki/gendered_wording
[45]: https://osf.io/download/p4r7z
[46]: https://osf.io/y6g5b/wiki/google_banned
[47]: https://osf.io/download/f7my6
[48]: https://osf.io/y6g5b/wiki/huliu
[49]: https://osf.io/download/nqbwh
[50]: https://osf.io/y6g5b/wiki/humor
[51]: https://osf.io/download/5z2me
[52]: https://osf.io/y6g5b/wiki/inquirer
[53]: https://osf.io/download/nurx3
[54]: https://osf.io/y6g5b/wiki/insults
[55]: https://osf.io/download/twyjb
[56]: https://osf.io/y6g5b/wiki/labmt
[57]: https://osf.io/download/a6fs9
[58]: https://osf.io/y6g5b/wiki/loughranmcdonald
[59]: https://osf.io/download/vey62
[60]: https://osf.io/y6g5b/wiki/lusi
[61]: https://osf.io/download/29ayf
[62]: https://osf.io/y6g5b/wiki/lusi_netspeak
[63]: https://osf.io/download/twb9s
[64]: https://osf.io/y6g5b/wiki/mfd
[65]: https://osf.io/download/smqgz
[66]: https://osf.io/y6g5b/wiki/mpqa_arguing
[67]: https://osf.io/download/8gb9q
[68]: https://osf.io/y6g5b/wiki/mpqa_effects
[69]: https://osf.io/download/jf6n9
[70]: https://osf.io/y6g5b/wiki/mpqa_subjectivity
[71]: https://osf.io/download/pxy4m
[72]: https://osf.io/y6g5b/wiki/mwnop
[73]: https://osf.io/download/gu5qx
[74]: https://osf.io/y6g5b/wiki/nrc_color
[75]: https://osf.io/download/z5pm3
[76]: https://osf.io/y6g5b/wiki/nrc_eil
[77]: https://osf.io/download/fjx6n
[78]: https://osf.io/y6g5b/wiki/nrc_sentiment
[79]: https://osf.io/download/v8h39
[80]: https://osf.io/y6g5b/wiki/nrc_vad
[81]: https://osf.io/download/v3e9g
[82]: https://osf.io/y6g5b/wiki/nrc_yelp
[83]: https://osf.io/download/w4nmq
[84]: https://osf.io/y6g5b/wiki/offensive
[85]: https://osf.io/download/98tbh
[86]: https://osf.io/y6g5b/wiki/personal_values
[87]: https://osf.io/download/xv379
[88]: https://osf.io/y6g5b/wiki/personality
[89]: https://osf.io/download/nqbs3
[90]: https://osf.io/y6g5b/wiki/pornography
[91]: https://osf.io/download/xzc5d
[92]: https://osf.io/y6g5b/wiki/prosocial
[93]: https://osf.io/download/awh6p
[94]: https://osf.io/y6g5b/wiki/regulatory_mode
[95]: https://osf.io/download/aens9
[96]: https://osf.io/y6g5b/wiki/sd_8k
[97]: https://osf.io/download/7kh9v
[98]: https://osf.io/y6g5b/wiki/sd_imdb
[99]: https://osf.io/download/pfu38
[100]: https://osf.io/y6g5b/wiki/securitization
[101]: https://osf.io/download/ypgcn
[102]: https://osf.io/y6g5b/wiki/senticnet
[103]: https://osf.io/download/kf4ra
[104]: https://osf.io/y6g5b/wiki/sentiwordnet
[105]: https://osf.io/download/mtwfr
[106]: https://osf.io/y6g5b/wiki/slangsd
[107]: https://osf.io/download/skrf3
[108]: https://osf.io/y6g5b/wiki/stopwords
[109]: https://osf.io/download/yt6wr
[110]: https://osf.io/y6g5b/wiki/vader
[111]: https://osf.io/download/nt5zx
[112]: https://osf.io/y6g5b/wiki/whissell
[113]: https://osf.io/download/fbz36
[114]: https://osf.io/y6g5b/wiki/wwbp_affect_intensity
[115]: https://osf.io/download/7thab
[116]: https://osf.io/y6g5b/wiki/wwbp_age
[117]: https://osf.io/download/urvtb
[118]: https://osf.io/y6g5b/wiki/wwbp_empathy_distress
[119]: https://osf.io/download/zfyk4
[120]: https://osf.io/y6g5b/wiki/wwbp_prospection
[121]: https://osf.io/download/cxz3j
[122]: https://osf.io/y6g5b/wiki/wwbp_sex
[123]: https://osf.io/download/h6uqz
[124]: https://osf.io/y6g5b/wiki/wwbp_wellbeing
[125]: https://osf.io/download/rwjtk