INDEX
    Explanations
    Negative Logits
    Token           Logit
    178             -0.07
    entered         -0.07
     شهرد           -0.07
    луб             -0.07
     alleged        -0.06
     heats          -0.06
    ouncements      -0.06
     allergy        -0.06
    _near           -0.06
     Avust          -0.06
    Positive Logits
    Token           Logit
     cope           0.15
     coping         0.14
     pratique       0.06
     comprehend     0.06
     manage         0.06
     mediation      0.06
                    0.06
     Fed            0.06
    /templates      0.06
    фор             0.06
    Act Density 0.004%

    No Known Activations