INDEX
    Explanations

    mentions of negative experiences or events

    terms related to systematic oppression and political manipulation

    Negative Logits
    usercontent    -0.52
    laws           -0.51
    ramid          -0.50
    Ire            -0.49
    ufact          -0.48
    ":[            -0.46
     regul         -0.46
     adolesc       -0.46
    doors          -0.46
    ]),            -0.46
    Positive Logits
    onge              0.54
     Description      0.52
     ·                0.51
     [/               0.51
    escription        0.51
     âľ               0.50
    DragonMagazine    0.50
     START            0.48
    inka              0.47
     };               0.47
    Act Density 1.806%

    No Known Activations