INDEX
Explanations
terms related to various species and their biological classifications
Negative Logits
Animals       -0.18
rats          -0.17
animals       -0.17
animals       -0.16
animal        -0.16
rodents       -0.15
Animal        -0.15
rats          -0.15
eration       -0.15
extensions    -0.15
Positive Logits
populations   0.24
population    0.23
/cat          0.18
species       0.18
husband       0.17
population    0.17
-human        0.17
physiology    0.17
friendly      0.16
/pl           0.16
Activations Density 0.105%