INDEX
    Explanations

    phrases indicating the impact or influence of actions or events on individuals

    phrases indicating the effects or consequences of actions

    Negative Logits
    "bow"             -0.70
    "cele"            -0.64
    "majority"        -0.61
    "ilings"          -0.61
    "mage"            -0.58
    "ban"             -0.58
    " Ce"             -0.58
    "bill"            -0.57
    " forthcoming"    -0.57
    "guide"           -0.57
    Positive Logits
    " raining"        0.86
    " tremend"        0.77
    "bnb"             0.76
    " alot"           0.73
    " easier"         0.69
    " CTR"            0.68
    "chwitz"          0.66
    " doub"           0.65
    "ometimes"        0.63
    " interesting"    0.63
    Activation Density: 0.222%

    No Known Activations