    Explanations

    quantities or numbers

    questions or statements that inquire about quantities

    Negative Logits
    hern        -0.76
    UAL         -0.72
    afort       -0.70
    olitics     -0.70
    ivism       -0.67
    Mobil       -0.66
    etry        -0.66
    avior       -0.64
    owder       -0.64
    aband       -0.63
    Positive Logits
     times        1.23
     thousand     0.96
     calories     0.95
     servings     0.93
     hundred      0.91
     instances    0.88
     copies       0.88
     hours        0.87
     people       0.87
     parentheses  0.87
    Act Density 0.045%

    No Known Activations