Explanations
references to madness and insanity
madness, insanity, and mania
Negative Logits
Token              Logit
(not shown)        -0.56
(not shown)        -0.42
➥                  -0.40
Administrativna    -0.40
Utf                -0.39
Sikh               -0.38
Carson             -0.38
Shir               -0.38
etsk               -0.37
zeera              -0.37
Positive Logits
Token       Logit
madness     1.87
Madness     1.76
Madness     1.59
insanity    1.23
locura      1.23
frenzy      1.02
mania       0.95
craz        0.94
Mania       0.86
chaos       0.85
Activations Density 0.004%