    Explanations

    phrases related to actions and consequences

    references to people and their actions or roles within various contexts

    Negative Logits
    Token          Logit
    " CrossRef"    -0.63
    "rition"       -0.57
    "iven"         -0.55
    "atever"       -0.54
    "ASY"          -0.54
    "again"        -0.54
    " Vish"        -0.52
    "ulkan"        -0.52
    "UNCH"         -0.52
    "ERAL"         -0.52
    Positive Logits
    Token             Logit
    "pires"           0.67
    " fitt"           0.66
    " paycheck"       0.64
    " swear"          0.61
    " Kardash"        0.59
    " fitness"        0.59
    " fart"           0.58
    " backgrounds"    0.56
    " themselves"     0.55
    " liv"            0.55
    Activation Density 1.244%

    No Known Activations