INDEX
    Explanations

    the word "up" in various contexts, indicating a focus on upward movement or positivity

    Negative Logits
    Token       Logit
    "t"         -0.23
    "ings"      -0.18
    "ildren"    -0.16
    "undler"    -0.16
    "tres"      -0.16
    "awai"      -0.16
    "amik"      -0.16
    "ureau"     -0.15
    "ectl"      -0.15
    "esson"     -0.15
    Positive Logits
    Token       Logit
    "root"      0.33
    "holding"   0.33
    "ping"      0.32
    "sets"      0.31
    "state"     0.30
    "river"     0.30
    "ped"       0.29
    "turned"    0.28
    "ended"     0.28
    " front"    0.28
    Activation Density: 0.046%

    No Known Activations