    Explanations

    Phrases indicating political norms and behaviors, particularly the denial or acceptance of certain truths

    Follows words like "for" or "as"

    Negative Logits
    " Wikiseite"      -0.67
    "Appropriate"     -0.55
    " engraçadas"     -0.53
    " Appropriate"    -0.51
    "calo"            -0.49
    "subpackage"      -0.47
    " pauvre"         -0.47
    "appropriate"     -0.46
    " attentive"      -0.45
    " רבה"            -0.45
    Positive Logits
    " guaranteed"     1.03
    " unquestion"     1.03
    " indisputable"   1.01
    " invio"          0.96
    " undisputed"     0.95
    " certainty"      0.90
    " automatic"      0.90
    " settled"        0.89
    " irreversible"   0.88
    " immutable"      0.88
    Activation Density: 0.522%

    No Known Activations