    Explanations

    questions related to self-reflection and self-doubt

    Negative Logits
    Token     Logit
    inery     -0.78
    iHUD      -0.76
    ©¶æ¥µ     -0.74
    abba      -0.69
    alde      -0.67
     pione    -0.67
    20439     -0.67
    abor      -0.66
    tails     -0.65
    jong      -0.64

    Positive Logits
    Token     Logit
    ?         1.98
    ?:        1.86
    ?"        1.82
    ?'        1.82
    ?)        1.76
    ?),       1.72
    ?",       1.72
    ?).       1.70
    ?!        1.70
    ...?      1.69
    Activation Density: 2.787%

    No Known Activations