INDEX
    Explanations

    phrases indicating imminent actions or events

    phrases indicating something that is about to happen or is imminent

    Negative Logits
    "chens"       -0.73
    " Forest"     -0.62
    "Express"     -0.60
    "glas"        -0.60
    "hesis"       -0.60
    "opoly"       -0.58
    "aurus"       -0.57
    "outs"        -0.56
    " Writ"       -0.56
    "Director"    -0.55
    Positive Logits
    " halfway"    0.95
    "eatures"     0.72
    "pheus"       0.70
    "inyl"        0.70
    " midway"     0.68
    "aleb"        0.67
    "ĻĤ"          0.64
    "reaching"    0.63
    " rall"       0.62
    " nowhere"    0.62
    Act Density 0.086%

    No Known Activations