    Explanations

    mentions of musical instruments or genres

    Negative Logits
    ync         -0.15
    enthal      -0.15
    ät          -0.14
    licht       -0.14
    obel        -0.14
    chrift      -0.14
    oup         -0.14
    .Atomic     -0.14
    imple       -0.14
     Ñĥгл       -0.14
    Positive Logits
    иÑĢов       0.16
    æļ          0.16
    Ñı          0.15
    arie        0.15
    ÑĥÑĩ        0.15
    ÑĥÑģ        0.15
    ť           0.14
     Hall       0.14
    Merit       0.14
    IME         0.14
    Activation Density 0.003%

    No Known Activations