INDEX
    NEGATIVE LOGITS
    Token                Logit
    (token not shown)    -0.07
    " destino"           -0.06
    "-outs"              -0.06
    (token not shown)    -0.06
    "/social"            -0.06
    " která"             -0.06
    " درون"              -0.06
    "67"                 -0.06
    "*\"                 -0.06
    " Βασ"               -0.06
    POSITIVE LOGITS
    Token                Logit
    " Supreme"           0.21
    " supreme"           0.14
    " Paramount"         0.10
    "su"                 0.09
    " robe"              0.08
    "ount"               0.08
    " paramount"         0.07
    " Верхов"            0.07
    "bmp"                0.07
    " Конститу"          0.07
    Activation Density: 0.004%

    No Known Activations