INDEX
    Explanations

    punctuation marks, specifically periods

    Negative Logits
    Token            Logit
    "."              -0.17
    "akens"          -0.16
    "hsi"            -0.15
    "["              -0.15
    " commitment"    -0.15
    "*"              -0.15
    "itu"            -0.14
    "zh"             -0.14
    "it"             -0.14
    "ul"             -0.14
    Positive Logits
    Token            Logit
    "5"              0.26
    "8"              0.24
    "75"             0.23
    "7"              0.22
    "6"              0.21
    "9"              0.21
    "0"              0.20
    "4"              0.19
    "3"              0.18
    "85"             0.18
    Activation Density: 0.083%

    No Known Activations