INDEX
    Explanations

    words related to political systems and government activities

    Negative Logits
    gur         -0.76
    yz          -0.72
    arta        -0.70
    lain        -0.68
    ingham      -0.64
    ZA          -0.61
    pelling     -0.60
    tek         -0.59
    LV          -0.59
    ammy        -0.58
    Positive Logits
    rils        1.22
    entious     1.14
    ril         0.97
     toward     0.96
    entimes     0.91
     towards    0.88
     grav       0.80
    erest       0.76
    erer        0.75
     favour     0.75
    Act Density 0.018%

    No Known Activations