    Negative Logits
    Token          Logit
    " законч"      -0.09
    "^{"           -0.08
    "ขึ้น"           -0.08
    " الن"         -0.07
    (not shown)    -0.07
    "inh"          -0.07
    " மத"          -0.07
    "partial"      -0.07
    "acters"       -0.07
    " respond"     -0.07

    Positive Logits
    Token          Logit
    " вниз"        0.11
    "(bottom"      0.11
    " asleep"      0.10
    " bottom"      0.10
    " beneath"     0.10
    "_BOTTOM"      0.10
    " below"       0.10
    " Below"       0.10
    " thấp"        0.09
    (not shown)    0.09
    Activation Density: 0.091%

    No Known Activations