INDEX
Explanations
phrases related to animal life and their interactions with the environment
Negative Logits
fucking       -0.17
.matcher      -0.15
fucked        -0.15
hell          -0.15
fuck          -0.14
Peripheral    -0.14
Fucking       -0.14
oxetine       -0.14
fuck          -0.14
fucks         -0.13
Positive Logits
Fin           0.20
Fin           0.18
animals       0.18
Animals       0.17
fin           0.17
Mo            0.16
mr            0.15
animal        0.15
Moo           0.15
Mr            0.15
Activations Density: 0.036%