NEGATIVE LOGITS
    " 그래서"      0.49
    "them"        0.45
    "Зна"         0.44
    " Зна"        0.43
                  0.43
    "그래서"       0.43
    "相当"         0.42
    " İlçesi"     0.42
    " Inoltre"    0.42
    " Instead"    0.42
POSITIVE LOGITS
    " opposed"    1.38
    "ymmet"       1.24
    " follows"    1.09
    "cribing"     1.08
    "sembles"     1.06
    " oppose"     1.01
    "cribable"    1.01
    "sembled"     0.98
    " soon"       0.95
    "sembling"    0.91
    Act Density 0.198%

    No Known Activations