    Explanations

    references to physical and sexual abuse or assault, particularly involving manipulation, threats, and force

    Negative Logits
     olx          -1.23
     haup         -1.20
     lara         -1.18
     lola         -1.18
     mef          -1.17
     fta          -1.16
     ibiza        -1.16
     sofia        -1.15
     lamborghini  -1.12
     lidl         -1.11
    Positive Logits
    almaz            0.59
     ErrIntOverflow  0.57
    ALLENGE          0.56
    figer            0.55
    /**              0.55
     الدولى          0.53
    TagMode          0.53
     @"/             0.52
    LabelTagHelper   0.52
    govine           0.51
    Activation Density: 0.568%

    No Known Activations