INDEX
    Explanations

    punctuation marks and their frequency in the text

    Tokens after commas that introduce contrasting information

    contrast and continuation

    Negative Logits
    Token         Logit
    ,             -0.94
    (not shown)   -0.76
    (not shown)   -0.73
    ...           -0.69
    ?             -0.64
    .             -0.63
    :             -0.60
    (not shown)   -0.59
    '             -0.59
    /             -0.55
    Positive Logits
    Token         Logit
     etc          1.16
     however      1.11
     }}$,         1.07
     albeit       0.99
     namely       0.91
     including    0.90
     which        0.89
     yaitu        0.89
    =,            0.89
     though       0.88
    Act Density 2.749%
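
The activation density figure above is commonly read as the fraction of sampled tokens on which the feature fires. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample values are assumptions for illustration, not taken from this page or from any particular tool.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than the threshold (0.0 by default).
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100 tokens, 3 of which activate the feature.
sample = [0.0] * 97 + [0.4, 1.2, 0.05]
print(f"{activation_density(sample):.3%}")  # prints 3.000%
```

Under this reading, a density of 2.749% would mean the feature is active on roughly 27 of every 1,000 sampled tokens.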

    No Known Activations