    Explanations

    names of popular music artists and cultural figures

    Negative Logits
    Token        Logit
    " vit"       -0.14
    "agem"       -0.14
    "оза"        -0.14
    " Bee"       -0.13
    " Kis"       -0.13
    "emetery"    -0.13
    "abama"      -0.13
    " Vit"       -0.13
    "folk"       -0.13
    " coin"      -0.13
    Positive Logits
    Token        Logit
    "argas"      0.17
    "еж"         0.16
    "ighbor"     0.15
    "ardon"      0.15
    " сол"       0.15
    "CDF"        0.14
    "locker"     0.14
    " الأخ"      0.14
    "Nut"        0.14
    "ask"        0.14
    Activation Density: 0.014%

    No Known Activations