INDEX
Explanations
proper names, particularly of individuals
Negative Logits
Token      Logit
Tep        -0.18
esti       -0.17
ROTO       -0.15
cura       -0.15
ansa       -0.15
enville    -0.15
otti       -0.15
osti       -0.15
arty       -0.15
nelly      -0.15
Positive Logits
Token      Logit
404        0.15
Verde      0.15
orex       0.15
dish       0.14
exual      0.14
_subset    0.14
rint       0.13
secretly   0.13
Cre        0.13
Pipes      0.13
Activations Density: 0.050%