Explanations
instances of criticism or negative evaluation of authority figures or societal issues
allegations, sanctions, criticised
Negative Logits
(missing)	-0.51
HtmlAttribute	-0.48
InjectAttribute	-0.41
Personendaten	-0.40
SequentialGroup	-0.39
AssemblyTitle	-0.39
fieldNum	-0.37
صوتيه	-0.36
Taktlose	-0.35
autorytatywna	-0.35
Positive Logits
zuletzt	0.46
خارجية	0.46
برانيه	0.41
estekak	0.40
irvana	0.39
äj	0.39
stdc	0.39
omitempty	0.38
evermore	0.38
TabStop	0.38
Activations Density 0.011%