INDEX
    Explanations

    phrases or sentences that indicate strong emphasis or comparison

    Negative Logits
    Token             Logit
    "axter"           -0.65
    "ãĤ¿"             -0.63
    "kefeller"        -0.63
    "bath"            -0.63
    "icides"          -0.61
    "kai"             -0.61
    "HCR"             -0.61
    "iday"            -0.60
    "tions"           -0.58
    "undo"            -0.58

    Positive Logits
    Token             Logit
    " partially"      0.78
    " partly"         0.78
    " SOME"           0.73
    "uner"            0.69
    " toler"          0.68
    " pretend"        0.67
    " temporarily"    0.66
    "hap"             0.65
    " theoretically"  0.65
    "lik"             0.65
    Activation Density: 1.122%

    No Known Activations
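
    Note on the token `ãĤ¿` in the negative-logit list: it looks like encoding debris, but it is more likely a raw vocabulary entry from a byte-level BPE tokenizer. The sketch below assumes the model uses the GPT-2-style byte-to-character table (the dashboard does not say which tokenizer is in use, so this is an assumption), and the helper names `gpt2_bytes_to_unicode` and `decode_token` are illustrative only. Under that assumption, the token maps back to the three UTF-8 bytes of the katakana character タ.

```python
def gpt2_bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the source does not name the tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 or above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


def decode_token(token: str) -> str:
    """Map a raw vocabulary string back to the text it represents."""
    char_to_byte = {c: b for b, c in gpt2_bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ¿"))
```

    Tokens made only of ASCII letters, such as `kefeller` or `tions`, pass through this mapping unchanged, which is why most entries in the lists already read as plain text.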