INDEX
    Explanations

    statements that involve disagreement or correction

    Uncertainty or disagreement

    statements of fact or opinion

    Negative Logits
    ]='\            -0.75
     виправивши     -0.61
    dflare          -0.56
    nste            -0.51
     [*]            -0.50
    )|^{            -0.49
    Associated      -0.49
    Pasos           -0.49
    ImageContext    -0.49
    Curi            -0.48
    Positive Logits
    这话            0.91
     assertion      0.84
     claim          0.81
     truth          0.80
    这句话          0.77
     statement      0.76
     afirma         0.76
     opinion        0.74
     Behaup         0.74
     verdade        0.73
    Activation Density: 0.584%

    No Known Activations