INDEX
    Negative Logits
        " 그래도"  0.62
        " però"  0.55
        "注意的是"  0.54
        " gdyż"  0.49
        "しかし"  0.48
        " toutefois"  0.48
        " sillon"  0.46
        " Ux"  0.46
        " scoperta"  0.46
        " мире"  0.45
    Positive Logits
        "forth"  0.54
        " amikor"  0.54
        "ledes"  0.51
        " عندما"  0.48
        "it"  0.48
        "although"  0.46
        "'"  0.46
        "quando"  0.45
        "oner"  0.44
        "可以说"  0.44
    Activation Density: 0.013%

    No Known Activations