    Explanations

    references to authority figures and their actions within various contexts

    Negative Logits
    irut      -0.15
    ulus      -0.15
    imb       -0.15
    iod       -0.14
    och       -0.14
    _SPE      -0.14
    ios       -0.14
    ari       -0.14
    390       -0.14
    gre       -0.14
    Positive Logits
    ÑĢави     0.18
    lasses    0.16
    hci       0.15
    voy       0.15
    lessly    0.15
    AILS      0.14
    -pocket   0.14
     âĵĺ      0.14
    жд        0.14
    ckett     0.13
    Activation Density: 0.525%

    No Known Activations