INDEX
Explanations
expressions that signify possession or association
Negative Logits
uru       -0.16
elm       -0.15
obile     -0.14
ibu       -0.14
udi       -0.14
otto      -0.14
resent    -0.14
nic       -0.14
uj        -0.13
foobar    -0.13
Positive Logits
lical       0.16
imson       0.15
Marsh       0.14
cardinal    0.14
409         0.14
orian       0.14
Rol         0.14
scal        0.14
essional    0.14
Dillon      0.13
Activations Density: 0.027%