INDEX
Explanations
specific categories of food and animals
Negative Logits
revers    -0.15
349       -0.15
857       -0.14
presso    -0.14
Cha       -0.14
Benson    -0.14
nde       -0.14
509       -0.14
const     -0.14
byn       -0.14
Positive Logits
ัà¸ģà¸Ĺ    0.15
æ¿        0.14
atters    0.14
esini     0.14
\helpers  0.14
alars     0.14
Neither   0.14
ngang     0.13
chwitz    0.13
ihan      0.13
Activations Density 0.205%