INDEX
    Explanations

    expressions of personal reflections and thoughts

    Negative Logits
    Token        Logit
    "lag"        -0.17
    "ĤŃ"         -0.15
    "sr"         -0.14
    "Mocks"      -0.13
    "oro"        -0.13
    "ilar"       -0.13
    "action"     -0.13
    " 详æĥħ"      -0.13
    "ister"      -0.13
    "æĥ"         -0.13
    Positive Logits
    Token              Logit
    " thoughts"        0.51
    " observations"    0.41
    " Thoughts"        0.40
    "observations"     0.38
    " mus"             0.33
    " Observ"          0.32
    "Thought"          0.31
    " notes"           0.30
    " remarks"         0.29
    "thought"          0.29
    Act Density 0.262%

    No Known Activations