    Explanations

    words related to governance, politics, and societal issues

    Negative Logits
    Token           Logit
    "chwitz"        -0.69
    "andise"        -0.68
    " Indra"        -0.67
    "ADRA"          -0.66
    "reon"          -0.65
    "CLASSIFIED"    -0.63
    "HEAD"          -0.61
    "ONSORED"       -0.59
    " indo"         -0.58
    " hotly"        -0.58

    Positive Logits
    Token           Logit
    " (<"           1.12
    "pox"           0.99
    " minded"       0.94
    "azaki"         0.83
    "minded"        0.80
    "folk"          0.80
    "case"          0.80
    "est"           0.80
    "tiny"          0.76
    "entary"        0.76

    Activation Density: 2.234%

    No Known Activations