INDEX
    Explanations

    verbal expressions indicating outcomes or revelations

    phrases that include the word "turn" in various forms

    Negative Logits
    Token           Logit
    " resembled"    -0.59
    "nces"          -0.58
    " resembles"    -0.56
    " idiots"       -0.55
    "lihood"        -0.53
    " dism"         -0.52
    " resemble"     -0.51
    " starved"      -0.50
    " everyday"     -0.50
    " anytime"      -0.50
    Positive Logits
    Token           Logit
    "llor"          0.71
    "erc"           0.66
    "aran"          0.65
    "erg"           0.64
    "rue"           0.64
    "chn"           0.64
    "ffe"           0.64
    "ere"           0.63
    "hoff"          0.59
    "ede"           0.59
    Act Density 0.356%

    No Known Activations