    Explanations

    personal pronouns, particularly "I"

    Negative Logits
    oca               -0.16
    apol              -0.15
    volt              -0.15
    vik               -0.15
    lili              -0.14
    ysi               -0.14
    让æĪij             -0.14
    AllowAnonymous    -0.14
    ën                -0.13
    ezier             -0.13
    Positive Logits
    SED               0.17
    orch              0.15
     think            0.15
    eyen              0.15
     personally       0.15
    rium              0.14
     Lazar            0.14
     Johannes         0.14
    IMP               0.14
    quin              0.14
    Act Density 0.105%

    No Known Activations