INDEX
    Explanations

    personal growth and positive affirmation-related words and phrases

    expressions of emotional states and interactions with people and surroundings

    Negative Logits
     confir        -0.72
     predec        -0.70
    ij士           -0.62
     notor         -0.60
     proport       -0.59
     Instr         -0.59
    ãĥ¼ãĥĨãĤ£     -0.58
    Recomm         -0.58
    ¥ŀ             -0.57
     conclud       -0.56
    Positive Logits
     huh        0.97
    !!!!        0.96
    !?          0.95
    !           0.91
    !!!         0.90
    !:          0.89
    !!          0.89
    ?!          0.89
    ?           0.85
    !!!!!!!!    0.82
    Act Density 0.711%

    No Known Activations