INDEX
    Explanations

    the attribution of authorship in texts

    Negative Logits
    ukes        -0.16
    าง          -0.16
    lej         -0.15
    ergency     -0.15
    erty        -0.15
    oods        -0.14
    ique        -0.14
    567         -0.14
    effect      -0.14
     prá        -0.14
    Positive Logits
    fait        0.18
    uras        0.17
    गल          0.16
    unix        0.16
    タル        0.16
    readcr      0.15
    Ệ           0.15
     Schneider  0.14
    섭          0.14
    icont       0.14
    Act Density 0.026%

    No Known Activations