INDEX
    Explanations

    pronouns related to self-reference and identity

    Negative Logits
    atch          -0.14
    ibus          -0.14
    mond          -0.14
    زÙĬØ©         -0.14
    illion        -0.14
    illos         -0.14
    .FontStyle    -0.14
    lines         -0.14
    ond           -0.14
    leans         -0.13
    Positive Logits
    -même         0.24
    zelf          0.21
    zÅij          0.16
    ipsis         0.16
    ĵ             0.15
    376           0.15
    ORT           0.14
    362           0.14
    enek          0.14
    762           0.14
    Activation Density: 0.049%

    No Known Activations