INDEX
    Explanations

    references to specific regions or movements associated with social or political issues

    Negative Logits
    DEN       -0.66
    chy       -0.64
    giving    -0.63
     STATS    -0.61
    LOAD      -0.60
     stale    -0.60
    à©        -0.60
    mberg     -0.60
    choes     -0.59
    STER      -0.59
    Positive Logits
    adia      1.16
    ribed     1.07
    adian     1.02
    ott       0.97
    inating   0.91
    henko     0.90
    otte      0.90
    pite      0.89
    inated    0.89
    ents      0.88
    Activation Density: 0.004%

    No Known Activations