INDEX
    Explanations

    mentions of measurement units such as 'g' and 'cm'

    instances of a specific character or symbol

    Negative Logits
     Polk         -0.71
     Salman       -0.70
     Virgin       -0.68
     Roose        -0.66
     Salon        -0.65
     Antar        -0.65
     Muss         -0.63
     Jihad        -0.62
     Barbara      -0.62
     democracy    -0.61
    Positive Logits
    felt           1.15
    s              1.12
    shed           1.05
    ses            1.05
    sed            1.05
    won            1.03
    sure           1.01
    should         0.97
    ved            0.95
    erent          0.95
    Activation Density: 0.279%

    No Known Activations