INDEX
    Explanations

    phrases or sentences ending with the phrase "followed by"

    Negative Logits
    Token        Logit
    "thal"       -0.72
    "linger"     -0.70
    "we"         -0.69
    "FR"         -0.68
    "Problem"    -0.67
    "wrong"      -0.67
    "raq"        -0.65
    "ran"        -0.65
    "winter"     -0.64
    "stakes"     -0.64
    Positive Logits
    Token             Logit
    " a"              0.91
    " an"             0.90
    " dozens"         0.86
    " another"        0.84
    " plenty"         0.84
    " numerous"       0.80
    " several"        0.79
    " considerable"   0.78
    " ample"          0.77
    " innumerable"    0.77
    Activation Density: 0.107%

    No Known Activations