INDEX
    Explanations

    instances of identification and self-reference

    Negative Logits
    elman     -0.16
    uries     -0.16
    lund      -0.16
     Barrett  -0.15
    kowski    -0.14
    asta      -0.14
    icl       -0.13
    ombie     -0.13
    anche     -0.13
    ANCH      -0.13
    Positive Logits
    озем      0.15
    804       0.15
    abor      0.14
    olor      0.14
    /design   0.14
    agnost    0.14
    ourse     0.14
    witter    0.13
     Fen      0.13
    cheiden   0.13
    Act Density 0.026%

    No Known Activations