    Explanations

    sensitive topics and issues

    Negative Logits
    Token           Logit
    " Entropy"      0.43
    " malos"        0.43
    "friendly"      0.41
    " useless"      0.41
    " autores"      0.39
    " friendly"     0.38
    " painters"     0.38
    "purpose"       0.37
    " machining"    0.37
    " बदमाशों"       0.37
    Positive Logits
    Token               Logit
    " sensitive"        1.40
    " Sensitive"        1.20
    "Sensitive"         1.20
    " controversial"    1.20
    "敏感"              1.16
    "sensitive"         1.15
    " sensitively"      1.09
    " حساس"             1.09
    " topic"            1.08
    " delicate"         1.07
    Activation Density: 0.019%

    No Known Activations