INDEX
    Explanations

    names of authors and academic references

    Negative Logits
    ازه
    -0.18
    utra
    -0.16
    .ActionListener
    -0.16
    ponent
    -0.16
    hazi
    -0.16
    elsea
    -0.15
    ymoon
    -0.15
    ahat
    -0.15
    ARSER
    -0.14
    ابان
    -0.14
    Positive Logits
     Crafts
    0.19
    å£
    0.17
     Obst
    0.16
     styl
    0.16
     Congressional
    0.15
     Autor
    0.15
    Autor
    0.15
     Rebel
    0.15
    ermal
    0.15
     re
    0.14
    Activation Density: 0.029%

    No Known Activations