INDEX
    Explanations

    overall rankings and supremacy

    NEGATIVE LOGITS
     bertujuan  0.44
     اصلی  0.39
     اصلي  0.39
     Introduce  0.38
     Successful  0.38
     بجائے  0.38
    Successful  0.37
     Instead  0.37
     አስፈላጊ  0.37
    แทน  0.37
    POSITIVE LOGITS
     arguably  1.13
     rankings  1.00
     consistently  0.99
     ranking  0.98
     supremacy  0.97
     ranked  0.96
     overall  0.95
     contenders  0.94
    argu  0.93
     contender  0.89
    Act Density 0.048%

    No Known Activations