INDEX
    Explanations

    assertions related to merit and credibility in arguments or claims

    Negative Logits
     myſelf      -0.85
     pleaſure    -0.84
     itſelf      -0.82
     Efq         -0.81
     purpoſe     -0.80
     greateſt    -0.77
     Conſ        -0.77
     '\\;'       -0.76
     Reſ         -0.75
     reaſon      -0.74
    Positive Logits
     proposta       0.56
     предложение    0.53
     arguments      0.51
     claims         0.51
     идеи           0.51
    Tazama          0.51
     refuted        0.51
     défend         0.50
     claim          0.49
    hver            0.47
    Activation Density 0.704%

    No Known Activations