INDEX
    Explanations
    Negative Logits
    " Trusted"       -0.09
    " довер"         -0.09
    " Dishwasher"    -0.09
    " confiança"     -0.08
    " confianza"     -0.08
    " confiance"     -0.08
    " כול"           -0.08
    "Trusted"        -0.08
                     -0.08
    "Shim"           -0.07
    Positive Logits
    " sexu"           0.11
    " interracial"    0.10
    " bodily"         0.10
    "裸体"            0.10
    " finishes"       0.09
    " expressly"      0.09
    " intercourse"    0.09
    " explicitly"     0.09
    " seksuele"       0.08
    " सेक्स"           0.08
    Activation Density: 0.016%

    No Known Activations