INDEX
Explanations
words that indicate companionship or connection
Negative Logits
Token      Logit
pty        -0.16
ppe        -0.16
OMATIC     -0.14
IFORM      -0.14
ookie      -0.14
ESH        -0.14
_MAY       -0.14
jection    -0.14
sse        -0.14
ookies     -0.14
Positive Logits
Token      Logit
foes       0.22
enemies    0.21
foe        0.21
enemy      0.20
Enemies    0.17
Enemies    0.17
acquaint   0.17
relatives  0.17
enemy      0.16
family     0.15
Activations Density: 0.034%