    Explanations

    references to people, particularly through the use of names and titles

    Negative Logits
    icari       -0.17
    olar        -0.17
    rief        -0.16
    uron        -0.15
    .UIManager  -0.15
    æº          -0.14
    leigh       -0.13
    fak         -0.13
    inem        -0.13
     pow        -0.13
    Positive Logits
    igos        0.18
    uzzi        0.15
    Ñĥли        0.15
    ÑģÑĤÑĭ      0.14
    stein       0.14
    ãĥ¥ãĥ¼      0.14
    lia         0.14
    luk         0.14
    631         0.14
    oday        0.14
    Activation Density: 0.003%

    No Known Activations