    Explanations

    words indicating actions and states of being

    Negative Logits
    Token          Logit
    "akin"         -0.16
    "мÑĸнÑĸ"        -0.15
    "/***/"        -0.15
    "recated"      -0.14
    "StringRef"    -0.14
    "exas"         -0.14
    "eling"        -0.14
    "uvwxyz"       -0.14
    "iq"           -0.14
    "bane"         -0.14
    Positive Logits
    Token          Logit
    " "            0.17
    " grad"        0.16
    "387"          0.15
    " Ki"          0.15
    " braz"        0.15
    "zin"          0.15
    " Sher"        0.15
    " ram"         0.14
    "rms"          0.14
    "yst"          0.14
    Act Density 0.023%

    No Known Activations