INDEX
Explanations
references to primates in general and to specific types such as apes and gorillas
references to apes and gorillas
Negative Logits
gren         -0.80
WIND         -0.71
tor          -0.71
Nap          -0.70
tw           -0.69
ttp          -0.68
vy           -0.66
spring       -0.66
mph          -0.66
demand       -0.66
Positive Logits
apes          1.26
chimpanzees   1.24
primates      1.13
gorilla       1.10
zee           1.05
Haram         1.04
zees          1.00
chimpan       1.00
ape           0.93
monkeys       0.92
Activations Density 0.013%