    Negative Logits
    "Was"       0.99
    "డింది"      0.91
    " አለበት"    0.82
    "Does"      0.81
    "was"       0.81
    " Was"      0.81
    " sebuah"   0.81
    "Isn"       0.81
    "টি"        0.80
    " էր"       0.80
    Positive Logits
    " are"           3.61
    " đều"           3.36
    " were"          3.19
    " ovat"          3.05
    "都有"           3.00
    " jsou"          2.96
    [token missing]  2.91
    "都可以"         2.88
    "都是"           2.87
    " ஆகியோர்"      2.85
    Activation Density: 1.019%

    No Known Activations