INDEX
    Explanations

    phrases related to past events or experiences in chronological order

    phrases related to sequences of events or experiences

    Negative Logits
    "rous"             -0.64
    " suspense"        -0.64
    "aston"            -0.64
    "Previous"         -0.62
    "framework"        -0.61
    " anticipation"    -0.61
    " fairness"        -0.60
    "pre"              -0.60
    "Definition"       -0.60
    "Condition"        -0.60
    Positive Logits
    " switched"        0.90
    " abruptly"        0.85
    " franch"          0.75
    " fateful"         0.72
    " branching"       0.70
    " shifted"         0.69
    " succumbed"       0.69
    "bart"             0.68
    " relent"          0.67
    "icably"           0.66
    Act Density 0.296%

    No Known Activations