INDEX
    Explanations

    terms associated with power dynamics and structural influence

    NEGATIVE LOGITS
    isode     -0.16
    æĶ¹       -0.15
    thon      -0.15
    stk       -0.15
    355       -0.14
    .uni      -0.14
    ighton    -0.14
    535       -0.14
    ui        -0.14
    ceed      -0.14
    POSITIVE LOGITS
     Paging    0.14
     Prayer    0.14
    undry      0.14
    _via       0.14
    æĶ¿        0.13
    coli       0.13
    ureka      0.13
     yyn       0.13
    otty       0.13
    овÑĸд      0.13
    Act Density 0.214%

    No Known Activations