INDEX
    Explanations

    instances of authorship or attributions in a text

    Negative Logits
    Token           Logit
    "op"            -0.17
    "ientes"        -0.16
    "ap"            -0.16
    "lands"         -0.15
    "ir"            -0.14
    "am"            -0.14
    " fame"         -0.14
    " Lands"        -0.14
    "avar"          -0.14
    ":#"            -0.13

    Positive Logits
    Token           Logit
    " admin"        0.16
    "ække"          0.16
    "edm"           0.15
    "chwitz"        0.14
    " staff"        0.14
    "gni"           0.14
    " gauss"        0.14
    " diseñador"    0.14
    "едак"          0.14
    "presso"        0.14
    Activation Density: 0.061%

    No Known Activations