INDEX
    Explanations
    Negative Logits
    Token          Logit
    " pozor"       -0.07
    " kes"         -0.07
    "blas"         -0.07
    " hemen"       -0.06
    (not shown)    -0.06
    " satin"       -0.06
    "-tier"        -0.06
    "pond"         -0.06
    " Ps"          -0.06
    "žen"          -0.06

    Positive Logits
    Token          Logit
    "(optarg"      0.08
    "ITIONAL"      0.06
    " pNode"       0.06
    "(food"        0.06
    "��"           0.06
    ".receive"     0.06
    " 世界"        0.06
    ")b"           0.06
    " Mattis"      0.06
    " المملكة"     0.06
    Activation Density: 0.002%

    No Known Activations