INDEX
    Explanations

    content that contains offensive language or illegal material

    offensive or hateful content

    Negative Logits
    " Dio"               -0.45
    " Sino"              -0.44
    " pk"                -0.44
    " IBOutlet"          -0.43
    " program"           -0.43
    " IFF"               -0.43
    " Ud"                -0.43
    " micro"             -0.42
    " NIS"               -0.42
    " back"              -0.42
    Positive Logits
    " ainfi"             0.53
    "verwijspagina"      0.52
    "guiente"            0.52
    " enfermed"          0.52
    " Jurí"              0.51
    " humanidade"        0.48
    " Infór"             0.47
    " Púb"               0.47
    " thérape"           0.47
    " ErrIntOverflow"    0.47
    Act Density 0.110%

    No Known Activations