INDEX
    Explanations

    verbs related to actions and desires

    Negative Logits
    zew           -0.15
     Dyn          -0.15
    agnost        -0.15
    rics          -0.14
    å½ĵ           -0.14
    edly          -0.14
    θε            -0.14
    vect          -0.14
    ange          -0.13
    quez          -0.13

    Positive Logits
     olan         0.16
    /request      0.15
    OA            0.14
    OwnProperty   0.14
    zm            0.14
    AY            0.13
    elter         0.13
    åΰçļĦ         0.13
    ched          0.13
    eel           0.13
    Act Density 0.261%

    No Known Activations