    Explanations

    verbs related to actions or events

    Negative Logits
    edReader       -0.17
    ably           -0.17
    arem           -0.15
    ालक            -0.15
    bound          -0.14
    igated         -0.14
    edException    -0.14
    ณ              -0.14
    aeper          -0.14
    cci            -0.14
    Positive Logits
    umm    0.22
    eck    0.21
    り     0.20
    ull    0.20
    ood    0.20
    own    0.20
    ore    0.20
    ust    0.19
    ail    0.18
    ron    0.18
    Activation Density: 0.021%

    No Known Activations