INDEX
    Explanations

    phrases discussing power dynamics and victimization in societal contexts

    Negative Logits
    " somewhat"         -0.84
    "somewhat"          -0.81
    " trochu"           -0.77
    " biraz"            -0.77
    " trochę"           -0.75
    " Somewhat"         -0.75
    " nieco"            -0.75
    " agak"             -0.72
    "Somewhat"          -0.68
    " lidt"             -0.67
    Positive Logits
    " unbelievably"     1.10
    " absolutely"       1.09
    " absolutamente"    1.07
    " utterly"          1.04
    " абсолютно"        1.02
    " literally"        1.01
    " incredibly"       1.00
    "absolutely"        0.98
    " extrêmement"      0.97
    " perfectly"        0.96
    Activation Density 2.489%

    No Known Activations