    Explanations

    names of individuals and references to specific people

    Negative Logits
    "また"      -0.18
    "th"        -0.18
    "er"        -0.17
    "jedn"      -0.16
    "あった"    -0.16
    "rosso"     -0.15
    "eric"      -0.15
    "kü"        -0.15
    "俗"        -0.15
    "aug"       -0.15
    Positive Logits
    "plorer"       0.17
    "sson"         0.17
    "ilda"         0.15
    "akedirs"      0.14
    "son"          0.14
    "-pane"        0.14
    " Skywalker"   0.14
    "есь"          0.13
    "ernals"       0.13
    "EDIA"         0.13
    Activation Density: 0.808%

    No Known Activations