INDEX
    Explanations

    phrases indicating outcomes or revelations

    phrases indicating situations that evolve or reveal themselves over time

    Negative Logits
    "achus"     -0.70
    "cham"      -0.68
    "afia"      -0.66
    "eatures"   -0.66
    "brow"      -0.66
    "panel"     -0.66
    "Previous"  -0.65
    "CW"        -0.64
    "antha"     -0.64
    "colo"      -0.64
    Positive Logits
    " quite"         0.76
    " REALLY"        0.73
    " pretty"        0.70
    " really"        0.70
    "ozy"            0.69
    " MUCH"          0.68
    " surprisingly"  0.66
    " reversed"      0.66
    " remarkably"    0.64
    " disastrous"    0.64
    Act Density 0.100%

    No Known Activations