    NEGATIVE LOGITS
    Token           Logit
    "WN"            -0.07
    " Ov"           -0.07
    " thắng"        -0.07
    "င်း"            -0.07
    " WOW"          -0.07
    " nødvend"      -0.07
    " Rus"          -0.07
    (blank)         -0.07
    " স্ব"           -0.07
    "ونس"           -0.07
    POSITIVE LOGITS
    Token           Logit
    "previous"      0.11
    " zuvor"        0.11
    " sebelumnya"   0.10
    " предыдущ"     0.10
    " previous"     0.10
    " tẹlẹ"         0.09
    (blank)         0.09
    " previously"   0.09
    ".previous"     0.09
    " અગાઉ"         0.09
    Act Density 0.020%

    No Known Activations