INDEX
    Explanations
    Negative Logits
     ವಿವರ
    -0.09
     Sys
    -0.09
     유명
    -0.08
     싶은
    -0.08
     traditions
    -0.08
     Nicola
    -0.08
    Sys
    -0.08
     gewend
    -0.08
     വിശദ
    -0.08
     yatır
    -0.08
    Positive Logits
    禁止
    0.10
     taboo
    0.09
     offens
    0.09
     avoidance
    0.09
     prohibition
    0.09
    0.08
     unrestricted
    0.08
     freely
    0.08
    文字
    0.08
     offensive
    0.08
    Act Density 0.004%

    No Known Activations