    Negative Logits
     nejen          0.52
     ikke           0.47
    வதில்லை          0.47
     olmadığı       0.43
     bukanlah       0.43
    少ない           0.42
     nicht          0.42
    ুটি             0.41
    不仅仅           0.41
    новременно      0.41
    Positive Logits
     I              0.44
     you            0.43
     please         0.42
     we             0.42
     Please         0.41
     Hãy            0.39
     Se             0.39
     Cont           0.39
     someone        0.39
     Someone        0.38
    Activation Density: 0.000%

    No Known Activations