    Explanations

    themes related to censorship and the suppression of public expression

    Negative Logits
    "atsu"         -0.07
    "ÑĢаÑģÑĤ"       -0.07
    "รà¸ĵ"          -0.07
    "á»Ļ"          -0.07
    "á»§"          -0.07
    "ĻĤ"           -0.06
    " ún"          -0.06
    "äºĽ"          -0.06
    "isch"         -0.06
    "áo"           -0.06
    Positive Logits
    " perfectly"   0.09
    " nowhere"     0.08
    " God"         0.06
    "Ħĸ"           0.06
    " Leg"         0.06
    " legitimate"  0.06
    "God"          0.06
    " "            0.06
    " repeatedly"  0.06
    "ebra"         0.06
    Activation Density: 0.038%

    No Known Activations