INDEX
    Explanations

    proper nouns, particularly names associated with historical figures and notable individuals

    Negative Logits
    Token    Logit
    ries     -0.17
    229      -0.15
    umed     -0.15
    827      -0.15
    sis      -0.15
    olia     -0.15
    ordes    -0.14
    orne     -0.14
    orum     -0.14
    ies      -0.13
    Positive Logits
    Token    Logit
    uai      0.15
     fitte   0.15
    æķ·      0.14
     zad     0.14
    /type    0.14
     Tos     0.14
    Ģìŀ¥     0.13
    hots     0.13
    inho     0.13
    ùy       0.13
    Activation Density: 0.113%

    No Known Activations