INDEX
    Explanations

    proper nouns, specifically names and titles

    Negative Logits
    ÏĢÏĮ        -0.06
    ace         -0.06
    midi        -0.06
     and        -0.06
    ly          -0.05
    acey        -0.05
    ie          -0.05
    arsity      -0.05
    uj          -0.05
    tp          -0.05
    Positive Logits
    _TA         0.08
    пÑĢиклад    0.08
    etim        0.08
     meis       0.08
    -et         0.08
    åĥį         0.07
     lesbi      0.07
    Ä±ÅŁÄ±k     0.07
    ÏĦÏģο       0.07
     born       0.07
    Act Density 0.020%

    No Known Activations