INDEX
Explanations
terms related to pedophilia and pedophiles
Negative Logits
publique       -0.67
Strauss        -0.66
nox            -0.65
puol           -0.65
hloromethane   -0.65
魂             -0.65
Márquez        -0.65
çais           -0.63
Chan           -0.63
soud           -0.63
Positive Logits
Ped            2.79
ped            2.76
Ped            2.66
ped            2.35
PED            2.28
pedals         2.02
PED            1.99
pedal          1.70
Pedal          1.68
pedi           1.56
Activations Density: 0.112%