INDEX
    Explanations

    references to power dynamics and authority

    Negative Logits
    " à¹Ĩ"          -0.15
    "ashtra"        -0.15
    "ijke"          -0.15
    "msgid"         -0.15
    "赫"            -0.14
    "abwe"          -0.14
    " selber"       -0.14
    "itech"         -0.14
    "anson"         -0.14
    "elper"         -0.14
    Positive Logits
    "801"           0.15
    " Voll"         0.14
    " denn"         0.14
    " dual"         0.14
    " Fritz"        0.14
    " appearances"  0.13
    "/Page"         0.13
    "sf"            0.13
    " appearance"   0.13
    " sf"           0.13
    Activation Density: 0.001%

    No Known Activations