The "BEAGLE vectors" folder contains FORTRAN files for each of the BEAGLE vectors. These include:

- word_list.txt: Complete list of words.
- ranked_stop_list.txt: All of the words that were excluded when compiling the semantic vectors.
- itemNstop396N1024.unformatted: All of the item vectors in FORTRAN format.
- orderNstop396N1024.unformatted: All of the order vectors in FORTRAN format.
- memNstop396N1024.unformatted: All of the memory vectors in FORTRAN format. These are sums of the item and order vectors and were not used in the paper.
- visual.unformatted: Randomly generated environmental vectors for each word.

The "Datasets" folder contains each of the datasets from the experiment in .csv format. Each individual trial is included, and each trial's global similarity values from BEAGLE are included as columns. A few notes:

- Mean or max similarity is noted in the column title ("meanContextCos" = mean item similarity, "maxContextCos" = maximum item similarity).
- Order similarity includes global similarity calculations from both the convolution method and the permutation method: the "OrderCos" columns are derived from the convolution method, while the "OrderP5Cos" columns are derived from the permutation method.

The "Model code" folder contains a .zip file with all of the files needed to run DE-MCMC with the BEAGLE-DDM model from the paper.

The attached GitHub repo contains the code used to construct the vectors. It is based on the BEAGLE model of Jones and Mewhort (2007), together with an alternative binding approach using random permutations (Sahlgren, Holst, & Kanerva, 2008). ... Note that parts of the code may be redundant because the approach changed during development. For instance, instead of shifting vectors on the fly when binding via random permutations, the code was later changed to pre-compute the permutation vectors and apply them with NumPy's vector indexing to speed up the process.
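The pre-computed-permutation trick mentioned above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the positions, dimensionality, and the `bind` helper are assumptions for the example.

```python
import numpy as np

DIM = 1024
rng = np.random.default_rng(1)

# One fixed random permutation per relative window position
# (illustrative; the repo's actual permutations may differ).
positions = [-2, -1, 1, 2]
perms = {p: rng.permutation(DIM) for p in positions}

def bind(env_vec, position):
    # Fancy indexing applies the pre-computed permutation in one
    # vectorised step, instead of shifting (e.g. np.roll) on the fly.
    return env_vec[perms[position]]

# Accumulate an order vector for a target word from its neighbours:
neighbours = {p: rng.standard_normal(DIM) for p in positions}
order_vec = np.zeros(DIM)
for p, vec in neighbours.items():
    order_vec += bind(vec, p)
```

Because each permutation is a fixed index array, binding a vector is a single NumPy indexing operation rather than a per-token shift.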
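The .unformatted vector files listed under "BEAGLE vectors" can be read record by record with `scipy.io.FortranFile`. This is only a sketch: one vector per record and double precision are assumptions, and the demo below round-trips a small synthetic file rather than the real ones (which would have 1024-dimensional records, per the "N1024" in their names).

```python
import os
import tempfile
import numpy as np
from scipy.io import FortranFile, FortranEOFError

def read_vectors(path, dtype=np.float64):
    """Read one vector per record from a Fortran unformatted sequential file."""
    vectors = []
    f = FortranFile(path, "r")
    try:
        while True:
            try:
                vectors.append(f.read_reals(dtype=dtype))
            except FortranEOFError:
                break
    finally:
        f.close()
    return np.vstack(vectors)

# Round-trip demo on a small synthetic file:
path = os.path.join(tempfile.mkdtemp(), "demo.unformatted")
expected = np.random.default_rng(0).standard_normal((3, 8))
f = FortranFile(path, "w")
for row in expected:
    f.write_record(row)
f.close()
vectors = read_vectors(path)
```

If the real files use single precision or a different record layout, adjust `dtype` and the reading loop accordingly.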
I have tested the code against typical examples to ensure correctness, but perhaps you'll notice something that I missed.

Below are links to the different vector files. The CSR-formatted vectors can be opened with SciPy's sparse library (each requires three files plus the dimensionality).

NOVELS (Random Permutation)

The corpus is too large to upload to OSF (about 700 MB), but is available through https://cloudstor.aarnet.edu.au/plus/s/N7Q8koKZuBLtxyP

CSR format. Dimensionality: 10000x39076

- Vocab: https://cloudstor.aarnet.edu.au/plus/s/vdiS6UVHAuP3PkH

Context:
- indices: https://cloudstor.aarnet.edu.au/plus/s/IvqXLuGOaIVtwYy
- indptr: https://cloudstor.aarnet.edu.au/plus/s/4AReRC4thXs0DwO
- data: https://cloudstor.aarnet.edu.au/plus/s/EMSS3cav3ye5IDT

Environment:
- indices: https://cloudstor.aarnet.edu.au/plus/s/SaXSPzerPJ9piOA
- indptr: https://cloudstor.aarnet.edu.au/plus/s/hOO8cQ84sMketKq
- data: https://cloudstor.aarnet.edu.au/plus/s/V5bG5o4RgCRvBFK

Order:
- indices: https://cloudstor.aarnet.edu.au/plus/s/Tz8B27SXbNHJn3T
- indptr: https://cloudstor.aarnet.edu.au/plus/s/Q3rdyLp7cuZVSRU
- data: https://cloudstor.aarnet.edu.au/plus/s/dBapH13kMjhH7zt

TASA (Holographic)

- Environment: https://cloudstor.aarnet.edu.au/plus/s/l6n0N8aaOyXs6PU
- Order (window size of 5): https://cloudstor.aarnet.edu.au/plus/s/khCQrKpDl8l7Ggw
- Context (window size of 50): https://cloudstor.aarnet.edu.au/plus/s/1uxFX0dkE9xAwVz
- Vocab: https://cloudstor.aarnet.edu.au/plus/s/pGdFksizXBR1e4w
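The three CSR component files (indices, indptr, data) plus the stated dimensionality can be reassembled with `scipy.sparse.csr_matrix`. This is a minimal sketch under stated assumptions: each component is taken to be a saved NumPy .npy array (swap `np.load` for `np.fromfile` with the right dtypes if they are raw binary), and the row/column orientation of the 10000x39076 shape may need transposing. The demo round-trips a small synthetic matrix; the file names are illustrative.

```python
import os
import tempfile
import numpy as np
from scipy.sparse import csr_matrix

# Stated on this page for the NOVELS vectors:
SHAPE = (10000, 39076)

def load_csr(indices_path, indptr_path, data_path, shape):
    # Assumes the three files are NumPy .npy arrays.
    data = np.load(data_path)
    indices = np.load(indices_path)
    indptr = np.load(indptr_path)
    return csr_matrix((data, indices, indptr), shape=shape)

# Round-trip demo with a small synthetic matrix:
tmp = tempfile.mkdtemp()
dense = np.array([[0.0, 1.0, 0.0], [2.0, 0.0, 3.0]])
m = csr_matrix(dense)
np.save(os.path.join(tmp, "indices.npy"), m.indices)
np.save(os.path.join(tmp, "indptr.npy"), m.indptr)
np.save(os.path.join(tmp, "data.npy"), m.data)
loaded = load_csr(os.path.join(tmp, "indices.npy"),
                  os.path.join(tmp, "indptr.npy"),
                  os.path.join(tmp, "data.npy"),
                  shape=dense.shape)
```

For the real files, pass the downloaded paths and `shape=SHAPE` (or its transpose, depending on how the matrix was written out).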