INDEX
    Explanations

    isolated punctuation marks, particularly periods

    Negative Logits
    Token         Logit
    <unused16>    -0.86
    <unused79>    -0.85
    <unused42>    -0.85
    <unused74>    -0.85
    <unused68>    -0.85
    <unused23>    -0.85
    <unused8>     -0.85
    <unused17>    -0.85
    <unused14>    -0.85
    <pad>         -0.85
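    Positive Logits
    Token         Logit
    .             0.45
    ↵↵            0.37
    mongoose      0.35
    gen           0.35
    (blank)       0.35
    (blank)       0.35
     G            0.33
    )             0.32
    '             0.32
    {             0.31
    (blank) = token not visible in the source, likely whitespace or a non-printing token.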
    Act Density 0.000%

    No Known Activations