INDEX
    Explanations

    references to a specific individual, likely involved in notable events or discussions

    Negative Logits
    "avers"     -0.17
    "verse"     -0.16
    "ails"      -0.16
    "ENTE"      -0.15
    "achts"     -0.15
    "tank"      -0.14
    "ugin"      -0.14
    "herits"    -0.14
    "нÑĮ"       -0.14
    "usk"       -0.14
    Positive Logits
    "oping"     0.20
    " sco"      0.19
    "oop"       0.18
    "oped"      0.18
    "oters"     0.18
    " Sco"      0.17
    "oby"       0.17
    "oter"      0.16
    "eye"       0.16
    "rido"      0.15
    Act Density 0.007%

    No Known Activations