INDEX
Explanations
references to genetic experiments involving primates
Negative Logits
poultry    -0.17
urray      -0.17
deer       -0.15
Snake      -0.15
itest      -0.15
adla       -0.14
fish       -0.14
snake      -0.14
iris       -0.14
yre        -0.14
Positive Logits
ape        0.37
prim       0.37
monkeys    0.36
monkey     0.35
Prim       0.33
gor        0.33
Monkey     0.32
chimpan    0.31
ap         0.29
chimp      0.28
Activations Density: 0.070%