INDEX
    Explanations

    actions or processes that lead to an outcome, improvement, or change

    actions that signify change or effect in various contexts

    Negative Logits
    nex       -0.70
    volent    -0.65
    conn      -0.63
    zo        -0.63
    street    -0.59
    shore     -0.58
     Antar    -0.58
    SW        -0.58
    sw        -0.58
    squ       -0.57
    Positive Logits
    ettings   0.99
    hift      0.95
    paces     0.95
    ometimes  0.91
    hement    0.79
    hirt      0.79
    ilver     0.76
    creen     0.74
    pace      0.74
    heet      0.71
    Activation Density: 0.462%

    No Known Activations