    NEGATIVE LOGITS
    "Changer"  -0.08
    "대를"  -0.08
    " Sonne"  -0.08
    " Vis"  -0.08
    " Sok"  -0.07
    [token missing in source]  -0.07
    " Ash"  -0.07
    "เต็ม"  -0.07
    " tank"  -0.07
    "제를"  -0.07
    POSITIVE LOGITS
    " breaches"  0.09
    " జరిగిన"  0.09
    "Occurred"  0.09
    " comprom"  0.09
    " pudiera"  0.08
    " breach"  0.08
    "侵犯"  0.08
    "公布"  0.08
    " confidentiality"  0.08
    " offending"  0.08
    Activation Density: 0.010%

    No Known Activations