INDEX
    Explanations

    first-person statements expressing personal thoughts or actions

    expressions of personal reflection and subjective opinions

    Negative Logits
    Token             Logit
    " conformity"     -0.58
    " harms"          -0.55
    "MpServer"        -0.53
    " delinqu"        -0.51
    "forms"           -0.51
    " deeds"          -0.50
    " subsistence"    -0.49
    " Klux"           -0.49
    " vitality"       -0.49
    " Samar"          -0.48

    Positive Logits
    Token             Logit
    " recommend"      0.65
    " curious"        0.62
    " delve"          0.61
    " appreciate"     0.60
    "uno"             0.59
    " imagine"        0.59
    " fond"           0.59
    " admittedly"     0.57
    " wondered"       0.57
    " wondering"      0.56

    Activation Density: 0.840%

    No Known Activations