INDEX
    Explanations

    references to prominent literary figures and their works

    Negative Logits
    Token        Logit
    "oley"       -0.16
    "oland"      -0.16
    "วย"         -0.15
    "jerne"      -0.15
    "åħ"         -0.14
    "_season"    -0.14
    "iglia"      -0.14
    " patri"     -0.14
    " Season"    -0.14
    "anas"       -0.14
    Positive Logits
    Token            Logit
    "EDA"            0.20
    "eda"            0.20
    " Rowling"       0.19
    " JK"            0.18
    "dete"           0.17
    " Dumbledore"    0.17
    "hog"            0.15
    "SOR"            0.15
    " Warner"        0.15
    " Potter"        0.15
    Activation Density: 0.010%

    No Known Activations