Explanations
politically charged language surrounding government actions and controversies
Negative Logits
nece           -0.58
--){           -0.56
είται          -0.56
OnEnable       -0.56
indisponible   -0.55
susten         -0.54
transférez     -0.54
Nego           -0.54
ʁ              -0.54
nhold          -0.53
Positive Logits
pourtant         0.66
StructEnd        0.52
GeneratedCode    0.51
ItemBackground   0.50
sekal            0.49
ftagPool         0.49
rağmen           0.45
FAILED           0.44
failed           0.43
supposedly       0.43
Activations Density 0.495%