INDEX
Explanations
mentions of primate species, specifically great apes like chimpanzees and gorillas
references to different species of primates, particularly great apes
Negative Logits
HUD        -0.80
sol        -0.74
oppable    -0.74
gren       -0.71
bringer    -0.70
spr        -0.67
Seym       -0.67
bg         -0.66
ved        -0.66
EA         -0.65
Positive Logits
zee            1.14
zees           1.14
chimpanzees    1.12
apes           1.00
chimpan        0.94
primates       0.91
gorilla        0.89
Haram          0.87
saf            0.77
utan           0.77
Activations Density: 0.037%