INDEX
    Explanations

    pronouns and verbs related to actions or decisions

    actions related to helping or supporting others

    Negative Logits
    Token                                 Logit
    ">>>>>>>>"                            -0.59
    "................................"    -0.58
    " [/"                                 -0.57
    "reports"                             -0.51
    "++++++++"                            -0.50
    " Explain"                            -0.50
    "////////////////////////////////"    -0.49
    "udos"                                -0.49
    "++++++++++++++++"                    -0.49
    " Nay"                                -0.49
    Positive Logits
    Token              Logit
    " supposedly"      0.60
    " otherwise"       0.57
    "ufact"            0.55
    " deems"           0.54
    " deemed"          0.53
    " inevitably"      0.52
    " legitimately"    0.52
    " deem"            0.51
    "udic"             0.51
    " allegedly"       0.50
    Activation Density: 0.923%

    No Known Activations