INDEX
    Explanations
    NEGATIVE LOGITS
     colorés      -0.69
     démocr       -0.68
     supérieurs   -0.68
     rayures      -0.68
     ouverts      -0.65
     genoux       -0.64
     enfans       -0.62
     prêts        -0.62
     ferons       -0.62
     armées       -0.61
    POSITIVE LOGITS
     >=",           0.71
     cherchés       0.69
     متعلقه         0.67
    ResponseWriter  0.64
     оригіналу      0.61
    MIDDLEWARE      0.58
    LookAnd         0.56
    хьтан           0.55
    eroen           0.54
    ंदीखरीदारी      0.53
    Activation Density: 0.491%

    No Known Activations