INDEX
    Explanations

    verbs denoting communication such as speaking, saying, and thinking

    verbs indicating ongoing actions or contributions

    Negative Logits
    "selves"       -0.78
    "Higher"       -0.70
    "respective"   -0.70
    "wayne"        -0.69
    "ocating"      -0.68
    "iners"        -0.67
    "iky"          -0.65
    "RELATED"      -0.64
    "wik"          -0.63
    "illion"       -0.62
    Positive Logits
    " himself"          0.82
    " herself"          0.76
    " his"              0.67
    " brilliantly"      0.65
    " shotgun"          0.63
    " Bord"             0.60
    " unsuccessfully"   0.60
    " sage"             0.59
    " valiant"          0.59
    "ographs"           0.59
    Activation Density: 0.420%

    No Known Activations