    Explanations

    punctuation, particularly periods and question marks

    Negative Logits
    Token               Logit
    "idis"              -0.16
    " tags"             -0.15
    " share"            -0.15
    " click"            -0.14
    " aforementioned"   -0.14
    " link"             -0.14
    " source"           -0.14
    " quick"            -0.13
    " type"             -0.13
    " number"           -0.13
    Positive Logits
    Token               Logit
    "Iron"              0.23
    " ↵↵"               0.18
    "It"                0.17
    " Iron"             0.17
    " That"             0.17
    "That"              0.17
    "iron"              0.17
    "But"               0.17
    "There"             0.16
    " Of"               0.16
    Activation Density: 0.181%

    No Known Activations