INDEX
    Explanations

    words that indicate racial or social injustice themes

    Negative Logits
     consape
    -0.52
     IMHO
    -0.50
    可以说
    -0.46
    可以说是
    -0.46
     dürfte
    -0.46
    ครับ
    -0.45
     IMO
    -0.45
    かなり
    -0.43
     imo
    -0.42
    %@",
    -0.42
    Positive Logits
     somehow
    1.32
     Somehow
    0.96
     magically
    0.96
     supposedly
    0.96
     яко
    0.93
     angeb
    0.88
    Somehow
    0.87
     supuestamente
    0.81
     irgendwie
    0.74
     miraculously
    0.73
    Activation Density 1.087%

    No Known Activations