    Explanations

    repeated instances of the word "per."

    Negative Logits
    oretical       -0.16
    ctal           -0.15
    .ali           -0.15
    rzy            -0.15
    们             -0.14
    eyin           -0.14
    apı            -0.14
    precedented    -0.13
    属于           -0.13
    zung           -0.13
    Positive Logits
    ò              0.30
    alt            0.29
    cor            0.28
    cep            0.28
    ci             0.27
    seg            0.27
    don            0.26
    fe             0.26
    icol           0.24
    ifer           0.24
    Act Density 0.007%

    No Known Activations