INDEX
    Explanations

    content that is classified as offensive or contains warnings about sensitive material

    content warnings and disclaimers about offensive, inappropriate, or restricted material.

    Negative Logits
    Token          Logit
    "DockStyle"    -0.33
    " cherchez"    -0.30
    " bingung"     -0.29
    "γνω"          -0.29
    "GUILayout"    -0.29
    " battre"      -0.29
    "hubung"       -0.28
    " envy"        -0.26
    " dogged"      -0.26
    "说明"         -0.26
    Positive Logits
    Token               Logit
    "sensitive"         1.02
    " objectionable"    1.01
    " offensive"        1.01
    " censored"         0.99
    "offensive"         0.99
    " sensitive"        0.96
    " inappropriate"    0.96
    "Offensive"         0.94
    "ensitive"          0.93
    "Sensitive"         0.92
    Activation Density: 0.427%

    No Known Activations