    Explanations

    trigger words indicating a sequential event or action

    instances of the word "Following."

    Negative Logits
    Token           Logit
    "imm"           -0.76
    "adle"          -0.72
    "immer"         -0.70
    "inese"         -0.67
    "agin"          -0.66
    "access"        -0.66
    "cci"           -0.65
    "eri"           -0.65
    "vere"          -0.63
    "elo"           -0.63
    Positive Logits
    Token           Logit
    "noon"          0.81
    "Īè"            0.73
    " follows"      0.71
    "SourceFile"    0.71
    "teen"          0.69
    " Following"    0.68
    "Following"     0.65
    "Sym"           0.65
    " Steps"        0.65
    ">:"            0.64
    Activation Density: 0.012%

    No Known Activations