INDEX
Explanations
verbs indicating actions or goals
instances of the word "take" and its variations
Negative Logits
Token          Logit
accompanies    -0.74
agre           -0.70
tions          -0.67
ese            -0.67
hess           -0.63
ateg           -0.60
Gear           -0.60
rehens         -0.58
ominated       -0.58
vine           -0.58
Positive Logits
Token            Logit
advantage        1.42
care             1.02
aways            1.02
aback            0.94
liberties        0.94
away             0.93
advant           0.90
cues             0.88
pains            0.88
responsibility   0.87
Activations Density 0.092%
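
For a sense of scale, the density figure can be read as the share of token positions on which the feature fires. The sketch below is only an illustration under that assumption: the page does not state how the density is computed. The function name activation_density, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero activation.
    The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 92 of which activate the feature.
sample = [0.0] * 99_908 + [1.5] * 92
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.092% corresponds to roughly 92 active positions per 100,000 tokens, which fits a sparse feature tied to one verb and its collocations.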