    Explanations

    phrases introducing alternative or clarifying information

    expressions that reference alternative explanations or viewpoints

    Negative Logits
    Token               Logit
    "yip"               -0.63
    "sson"              -0.62
    " overcame"         -0.61
    "iste"              -0.60
    "akery"             -0.60
    "iesta"             -0.59
    "onis"              -0.59
    "gets"              -0.58
    " finally"          -0.58
    "Accept"            -0.58

    Positive Logits
    Token               Logit
    "worldly"            1.39
    "words"              1.05
    " respects"          0.91
    "wise"               0.85
    " words"             0.84
    " contexts"          0.83
    " vein"              0.81
    " manner"            0.79
    " jurisdictions"     0.79
    " areas"             0.78
    Act Density 0.025%

    No Known Activations