INDEX
    Explanations

    references to hierarchical structures and power dynamics in social contexts

    Negative Logits
    ortion         -0.68
    cast           -0.67
    hops           -0.67
    delay          -0.66
    reddits        -0.65
    ãĥį            -0.65
    rav            -0.64
    ratulations    -0.64
    itone          -0.64
    faced          -0.64
    Positive Logits
     afar           1.56
     whence         1.07
     inception      1.06
     scratch        1.00
     outset         0.92
     standpoint     0.92
     infancy        0.91
     conception     0.90
     inside         0.86
     cradle         0.83
    Activation Density: 0.099%

    No Known Activations