INDEX
    Explanations

    terms related to politics and policy, with a focus on social justice, mental health, and government actions

    Negative Logits
    ãĤ¨ãĥ«    -0.96
    éŃĶ    -0.83
    SN    -0.81
    ãĥĥ    -0.81
    stown    -0.79
    ãĥĥãĥī    -0.77
    GMT    -0.75
    ãĥ¥    -0.74
    odox    -0.74
    gro    -0.74
    Positive Logits
     superpower    0.83
    eers    0.76
     equivalents    0.72
     nonprofits    0.69
     pamph    0.68
     dystop    0.68
     issues    0.68
     optimization    0.68
     guru    0.67
     satire    0.67
    Activation Density: 2.078%

    No Known Activations