INDEX
Explanations
phrases related to actions or events happening or being done "to" someone or something
instances of the word "to" indicating purpose or intention
Negative Logits
typ          -0.60
disadvant    -0.59
emort        -0.57
hemor        -0.55
behav        -0.54
shenan       -0.54
tun          -0.54
Seym         -0.53
vulner       -0.53
Vaugh        -0.53
Positive Logits
ggles         0.83
wered         0.81
ilet          0.77
celebrate     0.76
relieve       0.74
obtain        0.74
promote       0.73
accommodate   0.72
asted         0.72
avoid         0.72
Activations Density 0.199%