INDEX
    Explanations

    instances of the imperative form of verbs

    Negative Logits
    Token       Logit
    "ears"      -0.17
    "heimer"    -0.15
    "nection"   -0.14
    "erial"     -0.14
    "auth"      -0.13
    "اث"        -0.13
    "avors"     -0.13
    "erah"      -0.13
    "agher"     -0.13
    "åįĩ"       -0.13
    Positive Logits
    Token       Logit
    "Modes"     0.16
    "adera"     0.15
    "anza"      0.15
    "ile"       0.15
    "yun"       0.15
    "hs"        0.14
    " sum"      0.14
    "kke"       0.14
    "adil"      0.14
    "ys"        0.14
    Activation Density: 0.043%

    No Known Activations