INDEX
    Explanations

    phrases related to authoritarianism and dictatorships

    terms related to oppression and its effects

    Negative Logits
    bys               -0.85
    rog               -0.76
     Freak            -0.75
    vernment          -0.72
    ces               -0.70
    ereo              -0.69
    blers             -0.69
     Latvia           -0.66
    soDeliveryDate    -0.66
    busters           -0.65
    Positive Logits
    ciating           0.90
     desp             0.86
    anguage           0.80
    inement           0.80
    iculty            0.77
    endon             0.77
    itious            0.77
    phia              0.76
    essage            0.72
    igham             0.71
    Act Density 0.017%

    No Known Activations