INDEX
    Explanations

    verbs indicating actions or behavior

    negative statements about actions or experiences

    Negative Logits
    accompan       -0.76
    unknown        -0.74
    accompanied    -0.66
    abre           -0.66
     unparalleled  -0.65
    ãĤ¼ãĤ¦ãĤ¹      -0.65
    moil           -0.64
    pmwiki         -0.64
    avoid          -0.64
    andan          -0.63
    Positive Logits
     anymore       1.69
     anything      1.36
     nor           1.26
     any           1.21
     anybody       1.17
     anywhere      1.09
     enough        1.08
     ANY           1.05
     slightest     1.00
     anyone        0.99
    Act Density 0.240%

    No Known Activations