    Explanations

    first-person pronouns and expressions of personal experience or feeling

    Negative Logits
    Token             Logit
    " Enlight"        -0.16
    " enlight"        -0.16
    " pioneered"      -0.15
    " Typ"            -0.15
    " pioneering"     -0.15
    " enlightened"    -0.15
    "elog"            -0.15
    "quisite"         -0.15
    "PFN"             -0.14
    "lington"         -0.14
    Positive Logits
    Token             Logit
    " wanted"         0.20
    "rew"             0.19
    "wanted"          0.18
    " drew"           0.16
    " Wanted"         0.16
    "annis"           0.15
    "unix"            0.15
    "添"              0.15
    " ioutil"         0.15
    " channel"        0.14
    Activation Density: 0.135%

    No Known Activations