    Explanations

    numbers at the beginning of sentences and the symbol ':' in a text

    punctuation and formatting symbols, particularly colons

    Negative Logits
    Token          Logit
    "etheless"     -0.71
    " disliked"    -0.71
    " referees"    -0.69
    " receipt"     -0.66
    "ibur"         -0.65
    " evapor"      -0.65
    " brewed"      -0.63
    " stagn"       -0.63
    " pads"        -0.62
    " diver"       -0.62
    Positive Logits
    Token          Logit
    " Exactly"     0.93
    " Tonight"     0.86
    " Yeah"        0.82
    " Well"        0.80
    "Correct"      0.78
    " Wow"         0.76
    " Alright"     0.75
    " Explain"     0.75
    "Yeah"         0.74
    "Thirty"       0.74
    Activation Density: 0.043%

    No Known Activations