    NEGATIVE LOGITS
    Token             Logit
    "ŭ"               -0.96
    "𝗆"               -0.91
    "liebe"           -0.86
    "Reduc"           -0.84
    "stützung"        -0.84
    "Referenties"     -0.84
    "unjungi"         -0.84
    "('/:"            -0.82
    "welijk"          -0.82
    (blank in source) -0.82
    POSITIVE LOGITS
    Token             Logit
    " used"           8.38
    "used"            5.59
    "Used"            5.53
    " Used"           5.38
    " digunakan"      4.47
    " utilisé"        4.31
    "USED"            4.25
    " USED"           4.25
    " usado"          3.91
    " utilisés"       3.91
    Activation Density: 0.607%

    No Known Activations