    Explanations

    phrases or words expressing difficulty or disbelief

    phrases that indicate difficulty or challenges in understanding or explaining concepts

    Negative Logits
    Token            Logit
    "afety"          -0.86
    "ements"         -0.66
    "plates"         -0.65
    "Dialogue"       -0.65
    "erville"        -0.64
    "leys"           -0.64
    "RIPT"           -0.63
    "mails"          -0.62
    " Needs"         -0.61
    "cand"           -0.61

    Positive Logits
    Token            Logit
    " achieve"        1.09
    " accomplish"     1.05
    " distinguish"    1.04
    " obtain"         1.04
    " attain"         1.02
    " imagine"        1.02
    " reconcile"      1.01
    " convince"       1.00
    " conceive"       0.99
    " penetrate"      0.97
    Activation Density: 0.090%
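
The density figure above can be read as the share of tokens on which the feature fires. The following is a minimal sketch, assuming that "density" means the fraction of tokens with an activation strictly greater than zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 9 firing tokens out of 10,000 corresponds to 0.090%.
sample = [0.0] * 9991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.090%"
```

Under this reading, a density of 0.090% would mean the feature is active on roughly 9 of every 10,000 tokens, which would make it a sparse feature.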

    No Known Activations