Explanations
references to public disturbances or controversies
Negative Logits
arlı             -0.47
Gump             -0.47
Speise           -0.47
Frühstück        -0.47
AnchorTagHelper  -0.47
lioma            -0.46
Weaknesses       -0.46
ؤية              -0.45
élo              -0.45
paj              -0.43
Positive Logits
uproar           1.36
commotion        1.33
fuss             1.23
controversy      1.18
fanfare          1.07
Controversy      1.05
hype             1.02
fuss             1.01
ruck             0.97
furo             0.96
Activations Density 0.262%