INDEX
    Explanations

    punctuation marks

    punctuation marks, particularly commas

    Negative Logits
    " corrid"     -0.72
    "gow"         -0.71
    "taboola"     -0.65
    "robe"        -0.64
    "rique"       -0.61
    " fronts"     -0.60
    " seizure"    -0.59
    "ruck"        -0.59
    "rients"      -0.59
    "ney"         -0.58
    Positive Logits
    "but"         0.89
    " uh"         0.88
    " um"         0.86
    "BUT"         0.81
    " but"        0.80
    " except"     0.80
    " albeit"     0.79
    " alas"       0.78
    " namely"     0.76
    " oh"         0.71
    Activation Density: 0.260%

    No Known Activations