INDEX
    Explanations

    expressions related to the understanding or fairness of a situation

    expressions of justification or reasonableness

    Negative Logits
    " bowling"      -0.77
    " downed"       -0.67
    "worm"          -0.65
    "craft"         -0.61
    " virginity"    -0.61
    "ngth"          -0.60
    "bows"          -0.59
    "infect"        -0.59
    "Dur"           -0.59
    "stars"         -0.58
    Positive Logits
    " assume"        0.72
    " inference"     0.71
    "ensibly"        0.69
    " assumption"    0.68
    "ATOR"           0.68
    " conclude"      0.67
    "allery"         0.67
    " infer"         0.66
    " Luxem"         0.66
    " guiActive"     0.66
    Activation Density: 0.139%

    No Known Activations