INDEX
Explanations
phrases indicating political norms and behaviors, particularly the denial or acceptance of certain truths
Follows words like "for" or "as"
taken for granted
Negative Logits
Wikiseite     -0.67
Appropriate   -0.55
engraçadas    -0.53
Appropriate   -0.51
calo          -0.49
subpackage    -0.47
pauvre        -0.47
appropriate   -0.46
attentive     -0.45
רבה           -0.45
Positive Logits
guaranteed     1.03
unquestion     1.03
indisputable   1.01
invio          0.96
undisputed     0.95
certainty      0.90
automatic      0.90
settled        0.89
irreversible   0.88
immutable      0.88
Activations Density 0.522%
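
For orientation, an activation density like the one above is commonly read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the toy data are all assumptions for illustration, not the procedure this page used to arrive at 0.522%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active; the page itself does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of them active.
toy = [0.0] * 995 + [1.2, 0.4, 3.1, 0.9, 2.2]
print(f"{activation_density(toy):.3%}")  # prints 0.500%
```

Under this reading, a density of 0.522% would correspond to the feature being active on roughly 5 of every 1000 token positions.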