INDEX
Explanations
phrases related to controversial or sensitive topics, including political conflicts and criminal activities
Negative Logits
fax         -0.77
uncture     -0.76
oire        -0.75
bara        -0.74
atform      -0.73
iologist    -0.71
enance      -0.71
iasco       -0.69
ebin        -0.68
ruption     -0.68
Positive Logits
replacements    1.28
selves          1.23
losers          1.22
ambassadors     1.22
slaves          1.21
selves          1.17
heroes          1.16
combatants      1.15
pioneers        1.15
carriers        1.13
Activations Density: 0.383%