INDEX
    Explanations

    actions or intentions related to knowing or understanding

    statements expressing certainty or confidence in one's knowledge or abilities

    Negative Logits
    " vig"             -0.65
    "hement"           -0.63
    " anytime"         -0.62
    "姫"               -0.61
    " Provided"        -0.61
    "mony"             -0.60
    " Awareness"       -0.58
    " somew"           -0.58
    " Lives"           -0.57
    " Transparency"    -0.56
    Positive Logits
    "sbm"              0.75
    " talking"         0.73
    "/$"               0.70
    "rowing"           0.65
    "READ"             0.65
    "doing"            0.65
    " barg"            0.63
    "PLA"              0.63
    " supposed"        0.63
    "talking"          0.63
    Activation Density: 0.120%

    No Known Activations