INDEX
Explanations
references to specific species or scientific classifications
Negative Logits
ude         -0.15
山市        -0.14
umann       -0.14
zdrav       -0.13
-ce         -0.13
hest        -0.13
Opposition  -0.13
prep        -0.13
rement      -0.13
hel         -0.13
Positive Logits
alker       0.16
adol        0.15
سین         0.15
avl         0.14
RAIN        0.14
анка        0.14
ynes        0.14
egen        0.14
جی          0.14
egend       0.14
Activations Density 0.004%