INDEX
    Explanations

    words associated with power dynamics and control in societal and historical contexts

    Negative Logits
     '\\;'
    -0.46
    Życiorys
    -0.44
     carriers
    -0.40
     KERN
    -0.40
     Biôgrafia
    -0.40
     snippetHide
    -0.40
    🇶
    -0.39
     étoient
    -0.39
    tempts
    -0.38
     Temper
    -0.38
    Positive Logits
    ScreenState
    0.45
     Administrativna
    0.44
     nonUne
    0.44
    脚注の使い方
    0.42
     informée
    0.42
    AddTagHelper
    0.41
    outWeight
    0.38
    SharedCtor
    0.38
    marck
    0.38
    kháu
    0.38
    Activation Density: 0.260%

    No Known Activations