INDEX
    Explanations

    statements reporting feelings or experiences

    Negative Logits
    Token        Logit
    "ysz"        -0.17
    "loit"       -0.15
    "iena"       -0.15
    "aines"      -0.15
    "odel"       -0.14
    "rary"       -0.14
    "edback"     -0.14
    "idot"       -0.14
    "thood"      -0.14
    "oyer"       -0.14
    Positive Logits
    Token            Logit
    "anean"          0.17
    " conf"          0.16
    " My"            0.16
    "CTS"            0.15
    " Nu"            0.14
    " blow"          0.14
    " nou"           0.14
    "Nu"             0.14
    " pressures"     0.14
    " nu"            0.14
    Activation Density: 0.186%

    No Known Activations