INDEX
    Explanations

    phrases related to providing insight or a brief preview of future events

    references to glimpses or previews of information or concepts

    Negative Logits
    Token          Logit
    "hement"       -0.75
    "ients"        -0.71
    "KK"           -0.70
    "cial"         -0.69
    "ubs"          -0.67
    "die"          -0.66
    "lees"         -0.66
    "arently"      -0.64
    "eches"        -0.64
    " depended"    -0.63
    Positive Logits
    Token          Logit
    " whats"       0.91
    " sorts"       0.83
    "helm"         0.77
    " theirs"      0.73
    " ours"        0.73
    " doom"        0.71
    " what"        0.69
    " physiology"  0.69
    " reality"     0.68
    " how"         0.66
    Activation Density: 0.187%

    No Known Activations