INDEX
    Explanations

    mentions of actions or descriptions related to misconduct

    instances of the word "foul" in various contexts

    Negative Logits
    Token           Logit
    "_>"            -0.79
    "edia"          -0.78
    "akeru"         -0.75
    "UNCH"          -0.72
    "aeda"          -0.70
    "hare"          -0.69
    "Downloadha"    -0.69
    "igslist"       -0.67
    "ppelin"        -0.67
    "ocobo"         -0.67
    Positive Logits
    Token           Logit
    "cery"          0.91
    "terness"       0.83
    " smelling"     0.82
    " foul"         0.82
    "s"             0.77
    "sie"           0.75
    "nesses"        0.73
    " misc"         0.71
    "weather"       0.70
    "ness"          0.67
    Activation Density: 0.013%

    No Known Activations