INDEX
    Explanations

    proper nouns, specifically names

    Negative Logits
    istrovství
    -0.08
    izio
    -0.06
    ownt
    -0.06
    ksen
    -0.06
    CHANT
    -0.06
    gni
    -0.06
     subject
    -0.06
    _formats
    -0.06
    oyal
    -0.06
    ewise
    -0.06
    Positive Logits
    onen
    0.08
     buc
    0.07
    ерж
    0.07
    ğü
    0.07
    STA
    0.07
    _SA
    0.07
    _________________↵↵
    0.07
     सद
    0.07
    arel
    0.07
    olin
    0.06
    Act Density 0.000%

    No Known Activations