INDEX
    Explanations

    numbers indicating points being scored in different contexts

    instances of the word "score."

    Negative Logits
    Token        Logit
    "agan"       -0.81
    "etheless"   -0.61
    " hind"      -0.61
    "lly"        -0.60
    "perial"     -0.58
    " Society"   -0.57
    "IDA"        -0.57
    "conn"       -0.57
    "por"        -0.57
    " pleas"     -0.56
    Positive Logits
    Token        Logit
    " score"     1.21
    " scores"    1.08
    " Score"     1.01
    "ificant"    0.93
    "card"       0.91
    "keeper"     0.88
    " Scores"    0.87
    " scored"    0.83
    "cards"      0.80
    "Score"      0.80
    Act Density 0.010%

    No Known Activations