INDEX
    Explanations

    dates and events like games, attacks, or sentences

    Negative Logits
    laim        -0.79
    anan        -0.70
    cientious   -0.69
    etimes      -0.69
    */(         -0.68
    intend      -0.66
    kef         -0.65
    cules       -0.63
    geries      -0.63
    certain     -0.63
    Positive Logits
    east        0.76
     onwards    0.65
     onward     0.63
    04          0.63
     coasts     0.61
     drills     0.59
     lows       0.59
     arrives    0.59
    004         0.59
    Tokens      0.58
    Act Density 0.121%

    No Known Activations