INDEX
    Explanations

    instances of authorship or attribution in text

    Negative Logits
    ContentLoaded    -0.16
    oras             -0.16
    nelly            -0.15
    abay             -0.15
    isto             -0.15
    _COPY            -0.14
    cargo            -0.14
    loud             -0.14
    /document        -0.14
    mada             -0.14
    Positive Logits
    pto               0.20
     means            0.20
    rne               0.18
    laws              0.17
     virtue           0.17
    gone              0.17
    ła                0.16
    hra               0.16
    gg                0.15
     dint             0.15
    Activation Density: 0.149%

    No Known Activations