Explanations
instances of criticism or controversial statements regarding public figures or institutions
Negative Logits
ș               -0.40
譙              -0.39
BuildContext    -0.38
ionized         -0.38
argint          -0.38
ionization      -0.37
analyze         -0.36
HUR             -0.36
vacation        -0.34
BrowserModule   -0.34
Positive Logits
sizeCache       0.60
elemField       0.57
rungsseite      0.56
oa̍t             0.54
cowboys         0.54
appalling       0.54
catalogue       0.52
catalogue       0.51
criticise       0.50
cowboy          0.49
Activations Density: 0.434%