INDEX
    Explanations

    multiple occurrences of punctuation marks

    Negative Logits
    Token     Logit
    2         -0.70
    1         -0.61
    3         -0.61
    5         -0.53
    4         -0.52
    9         -0.52
    7         -0.51
    𝙫         -0.51
    8         -0.50
    -         -0.48
    Positive Logits
    Token     Logit
    .$,       1.51
    ,:),      1.25
    ,-,       1.23
    ,",       1.23
    ​,        1.20
    !("{}",   1.20
    ,<        1.17
    °,        1.16
    €,        1.16
    \%,       1.15
    Act Density 0.632%

    No Known Activations