    NEGATIVE LOGITS
    " bolje"          0.45
    " importants"     0.42
    " konie"          0.42
                      0.42
    " courrier"       0.42
    " periodistas"    0.42
    "อด"              0.41
    " जमकर"           0.40
    " authorise"      0.40
    " dvě"            0.40
    POSITIVE LOGITS
    "+"               0.52
    "\"               0.47
    "velle"           0.46
    " t"              0.45
    "]."              0.43
    " l"              0.43
    " Roth"           0.41
    "j"               0.41
    " moment"         0.40
    " Band"           0.40
    Activation Density: 0.003%

    No Known Activations