INDEX
    Explanations

    concepts related to progress and forward movement

    Negative Logits
    Token            Logit
    "bons"           -0.16
    "ursive"         -0.16
    "itur"           -0.15
    "ivant"          -0.15
    "iego"           -0.15
    ".invalidate"    -0.15
    "awah"           -0.14
    "μεν"            -0.14
    "ebek"           -0.14
    "振り"           -0.14
    Positive Logits
    Token            Logit
    " momentum"      0.17
    "roc"            0.16
    "596"            0.16
    "595"            0.16
    " toward"        0.16
    "oku"            0.15
    " towards"       0.15
    "597"            0.14
    "747"            0.14
    "perator"        0.14
    Activation Density: 0.012%

    No Known Activations