    Explanations

    words related to social or political issues and actions, especially those with negative connotations

    terms associated with anti-arguments or opposition to various issues

    Negative Logits
    Token           Logit
    " Rue"          -0.70
    " dots"         -0.66
    " Mock"         -0.65
    "KNOWN"         -0.65
    "staking"       -0.62
    " STATS"        -0.62
    " McH"          -0.61
    " notebooks"    -0.60
    " mit"          -0.59
    " Nun"          -0.59

    Positive Logits
    Token           Logit
    "usterity"      0.99
    "roleum"        0.86
    "otic"          0.86
    "otics"         0.84
    "byter"         0.80
    "aphael"        0.79
    "ilib"          0.78
    "osher"         0.78
    "amacare"       0.76
    "ucl"           0.76

    Act Density 0.132%

    No Known Activations