INDEX
    Explanations

    phrases related to actions or decisions

    instances of the word "taken."

    Negative Logits
    ulate       -0.62
    raft        -0.61
    liction     -0.58
    rose        -0.58
    lin         -0.56
    rous        -0.56
    ense        -0.56
    lier        -0.56
    saw         -0.56
    ove         -0.55
    Positive Logits
     taken       3.37
     Taken       2.20
     eaten       1.60
     flown       1.57
     undertaken  1.52
     gone        1.37
     borne       1.35
     seized      1.35
     done        1.30
     thrown      1.30
    Act Density 0.029%

    No Known Activations
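
    The "Act Density" figure above can be read as the share of token positions on which the feature fires. The sketch below shows one way such a percentage could be computed; it assumes density means "fraction of tokens with activation above a threshold", which the page itself does not state. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds threshold.

    Assumption: density = (active tokens / total tokens) * 100.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which are active.
sample = [0.0] * 9997 + [1.2, 0.4, 2.7]
print(f"{activation_density(sample):.3f}%")  # prints "0.030%"
```

    Under this reading, a density of 0.029% would correspond to roughly 29 active tokens per 100,000.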