    Explanations

    references to authors or writers

    mentions of authorship or attribution in text

    Negative Logits
    bia         -0.88
    MpServer    -0.80
    asy         -0.76
    iasm        -0.76
    aser        -0.75
    pect        -0.74
    vous        -0.73
    ounter      -0.72
    apor        -0.71
    ouple       -0.71
    Positive Logits
     virtue      0.92
     Richard     0.78
     Wizards     0.78
     Warren      0.77
     Hasan       0.77
     Michele     0.74
     Robert      0.74
     Hug         0.74
     Rod         0.74
     Juan        0.74
    Activation Density: 0.089%

    No Known Activations