    Explanations

    references to innocence or being portrayed as innocent

    Negative Logits
    Token             Logit
    " كومونز"         -0.79
    " Dowling"        -0.71
    " numberWith"     -0.60
    " oração"         -0.59
    " casket"         -0.59
    " useDispatch"    -0.58
    " externi"        -0.57
    " Vib"            -0.57
    " Vle"            -0.56
    " Zwie"           -0.56
    Positive Logits
    Token             Logit
    " Innocence"      1.47
    "innoc"           1.41
    "Innoc"           1.40
    " Innoc"          1.32
    " Innocent"       1.30
    " innocence"      1.25
    "innocent"        1.25
    " innocent"       1.23
    " innoc"          1.15
    " inocente"       1.06
    Activation Density: 0.009%

    No Known Activations