INDEX
    Explanations

    phrases related to predictions or future outcomes

    phrases indicating certainty or predictions about future events

    Negative Logits
    "onian"        -0.66
    " Franch"      -0.65
    "usc"          -0.61
    "riott"        -0.61
    "Express"      -0.60
    "atures"       -0.60
    "76561"        -0.59
    "osal"         -0.59
    "NOW"          -0.59
    " Register"    -0.58

    Positive Logits
    " remembered"   1.03
    " judged"       0.93
    " harder"       0.91
    " sorely"       0.91
    " eaten"        0.90
    " difficult"    0.89
    " phased"       0.88
    " tougher"      0.88
    " punished"     0.86
    " easier"       0.85
    Activation Density: 0.173%

    No Known Activations