Explanations
concepts related to animal rights and welfare
Negative Logits
Token    Logit
ÑĨем     -0.16
nis      -0.16
viper    -0.15
iskey    -0.15
SYM      -0.15
511      -0.15
remen    -0.15
ÑĪка     -0.14
vl       -0.14
oled     -0.14
Positive Logits
Token    Logit
animal   0.34
Animal   0.34
cruelty  0.31
Animal   0.30
animal   0.28
Cru      0.27
animals  0.27
humane   0.24
animals  0.23
Animals  0.22
Activations Density 0.111%