INDEX
    Explanations

    pronouns that indicate second-person and first-person perspectives

    Negative Logits
    Token          Logit
    " sobie"       -0.18
    "box"          -0.15
    "ually"        -0.15
    "spor"         -0.14
    " Forge"       -0.14
    " vlas"        -0.14
    " Ortiz"       -0.14
    "ildo"         -0.14
    " mac"         -0.14
    "paramref"     -0.14
    Positive Logits
    Token          Logit
    "esson"        0.16
    "è³"           0.16
    " Pla"         0.16
    "chooser"      0.15
    " tabs"        0.15
    "ekli"         0.15
    "rong"         0.15
    "_tokenize"    0.15
    "andler"       0.14
    " alive"       0.14
    Activation Density: 0.025%

    No Known Activations