INDEX
Explanations
words related to anatomical or physiological components
Negative Logits
ÅĦ        -0.16
upal      -0.16
unger     -0.15
azio      -0.15
ftar      -0.15
\Traits   -0.15
488       -0.14
šek       -0.14
peria     -0.14
ÄIJT      -0.14
Positive Logits
oref      0.17
rack      0.15
trad      0.15
alg       0.14
atak      0.14
iona      0.14
etur      0.14
ά         0.14
Stars     0.14
naken     0.14
Activations Density: 0.000%