INDEX
Explanations
phrases indicating possession or ownership
possessive forms related to people
NEGATIVE LOGITS
obin     -0.85
ulhu     -0.84
Ͻ        -0.81
rette    -0.79
udo      -0.74
uin      -0.73
lished   -0.73
ctor     -0.72
zin      -0.71
crow     -0.70
POSITIVE LOGITS
throats        1.08
necks          1.02
noses          1.00
identities     0.99
efforts        0.98
bodies         0.98
minds          0.97
mouths         0.97
frustrations   0.97
brains         0.96
Activations Density 0.071%