    Negative Logits
     Flair            -0.66
     gib              -0.65
     rhetor           -0.65
     ceasefire        -0.63
     irrever          -0.61
     Verdict          -0.61
     toJson           -0.60
    pidou             -0.60
     Inoc             -0.59
     ILogger          -0.59
    Positive Logits
    تقاوى             0.71
     remplacement     0.65
     plainte          0.64
     CreateTagHelper  0.61
     compréhen        0.59
     naturels         0.57
     pauvres          0.57
     dulu             0.56
    bní               0.56
    tuan              0.56
    Activation Density: 0.086%

    No Known Activations