INDEX
    Explanations

    phrases or terms related to critical socio-political discourse

    Negative Logits
     fortun        -0.91
     sacrific      -0.88
     mathemat      -0.84
     myster        -0.84
     comr          -0.79
     disadvant     -0.78
     suspic        -0.77
     notor         -0.75
     hurd          -0.75
     cryst         -0.73
    Positive Logits
    ï¸ı            1.32
    ski            1.11
    tu             0.95
    tal            0.95
    tre            0.94
    sky            0.93
    sic            0.92
    tsy            0.91
    heim           0.90
    ti             0.90
    Act Density 0.150%

    No Known Activations