INDEX
    Explanations

    instances of authorship or attribution in the text

    Negative Logits
    arer      -0.16
    gang      -0.15
    fram      -0.14
    '])?      -0.14
     heat     -0.14
     Heat     -0.14
    iffer     -0.13
    ecies     -0.13
     eskort   -0.13
    Heat      -0.13
    Positive Logits
    antro     0.16
     Počet    0.15
     stag     0.15
    AGR       0.15
    λα        0.14
     αγ       0.14
    445       0.13
    RIORITY   0.13
    λά        0.13
    чен       0.13
    Activation Density: 0.032%

    No Known Activations