    Explanations

    sexually explicit texts

    Negative Logits
     environmental
    -0.07
     launches
    -0.07
    ken
    -0.07
     Kop
    -0.07
    л
    -0.07
    -0.06
     agency
    -0.06
     جان
    -0.06
    ffer
    -0.06
     Institutes
    -0.06
    Positive Logits
    Prepare
    0.06
     snapchat
    0.06
    0.06
     PUS
    0.06
    нів
    0.06
     NON
    0.06
     evacuate
    0.06
     Sno
    0.06
    براير
    0.06
    =Math
    0.06
    Activation Density 0.040%

    No Known Activations