    Explanations

    phrases introducing or explaining concepts or ideas

    phrases indicating consequences, benefits, or explanations in a discussion

    Negative Logits
    "eric"      -0.68
    "onics"     -0.62
    "hell"      -0.60
    "psc"       -0.59
    "onement"   -0.58
    "raphic"    -0.58
    "lems"      -0.57
    "raid"      -0.56
    "borg"      -0.56
    "ppers"     -0.55
    Positive Logits
    " involves"        0.87
    " relates"         0.85
    " overlooked"      0.85
    " limitation"      0.77
    " includes"        0.75
    " arises"          0.74
    " excludes"        0.73
    " pecul"           0.71
    " contributing"    0.70
    " distinguishes"   0.70
    Activation Density: 0.142%

    No Known Activations