INDEX
Explanations
phrases indicating the impact or influence of certain actions
instances of the word "by" indicating causation or influence
Negative Logits
raq         -0.82
ASE         -0.77
ourage      -0.76
iatrics     -0.75
NAS         -0.74
yssey       -0.73
heastern    -0.72
eva         -0.72
trap        -0.70
ppa         -0.69
Positive Logits
virtue      0.95
products    0.93
sheer       0.80
nature      0.79
mobs        0.76
lightning   0.72
Hurricane   0.70
illness     0.69
accident    0.69
rumours     0.68
Activations Density: 0.097%