INDEX
    Explanations

    phrases starting with "While"

    phrases that introduce contrasting ideas or conditions

    Negative Logits
    Token               Logit
    "ISE"               -0.81
    "isable"            -0.79
    "ise"               -0.77
    "aer"               -0.73
    "irs"               -0.71
    "omet"              -0.70
    "iotic"             -0.68
    "atron"             -0.68
    "romeda"            -0.68
    "arse"              -0.67
    Positive Logits
    Token               Logit
    " researching"      0.87
    " acknowledging"    0.81
    " browsing"         0.80
    " discussing"       0.77
    " compiling"        0.76
    " agreeing"         0.76
    " commenting"       0.73
    " attending"        0.73
    "catentry"          0.70
    " evaluating"       0.70
    Activation Density 0.034%

    No Known Activations