INDEX
    Explanations

    punctuation marks and formatting variations within the text

    Negative Logits
     pil        -0.16
    atego       -0.15
    ppo         -0.15
     Harrison   -0.15
    pell        -0.15
    ami         -0.15
     hyp        -0.15
    âm          -0.14
    hyp         -0.14
    rangle      -0.14

    Positive Logits
    anter        0.18
    iglia        0.15
    ç£           0.15
    itre         0.15
    elerik       0.14
    oth          0.14
    prs          0.14
     Pages       0.14
    shade        0.13
     Trident     0.13
    Activation Density: 0.060%

    No Known Activations