INDEX
    Explanations

    mention of family members

    conjunctions and repeated phrases emphasizing connection and continuity

    Negative Logits
    oward       -0.93
    prise       -0.76
    itatively   -0.75
    ggles       -0.75
    utt         -0.73
    ruce        -0.72
    ucc         -0.72
    ulent       -0.70
    uts         -0.70
    resh        -0.69
    Positive Logits
     therefore      1.36
     hence          1.17
     thus           1.17
     consequently   1.07
     cannot         0.97
     nobody         0.90
     incapable      0.90
     prone          0.88
     secondly       0.86
     enjoys         0.85
    Act Density 0.289%

    No Known Activations