INDEX
    Negative Logits
    "elman"           -0.07
    " flashed"        -0.06
    " Middleware"     -0.06
    "лян"             -0.06
    " Erg"            -0.06
    " ramen"          -0.06
    "Girls"           -0.06
    "ilin"            -0.06
    "laps"            -0.06
    " underwear"      -0.06
    Positive Logits
    " artwork"        0.07
    " gratuitement"   0.07
                      0.06
    " procure"        0.06
    " upkeep"         0.06
    " generally"      0.06
    " iets"           0.06
    "كس"              0.06
    " offenders"      0.06
    " Obr"            0.06
    Activation Density: 0.014%

    No Known Activations