INDEX
    Explanations

    instances of inappropriate sexual relationships or misconduct involving teachers and students

    Negative Logits
    æķ·    -0.17
    undi    -0.15
    hetto    -0.15
    бÑĥ    -0.14
    estruct    -0.14
    ontrol    -0.14
    ouston    -0.14
    shiv    -0.14
     baiser    -0.14
    _embed    -0.14
    Positive Logits
    oun    0.17
     harm    0.15
    aser    0.15
    ouns    0.14
    407    0.14
    951    0.14
    vik    0.13
    ÙħاÙħ    0.13
    è¡    0.13
    Ïĥκε    0.13
    Activation Density: 0.234%

    No Known Activations