INDEX
    Explanations

    words related to intentional and conscious actions

    Negative Logits
    Token     Logit
    sel       -0.16
    urm       -0.15
    eler      -0.15
    ANGLES    -0.15
    presso    -0.14
    pector    -0.14
    ixel      -0.14
     cum      -0.14
    arend     -0.14
     sonst    -0.14
    Positive Logits
    Token     Logit
    mente     0.17
    izia      0.15
    шин       0.15
    297       0.15
    -mf       0.15
    398       0.14
    aways     0.14
    evin      0.14
    atio      0.14
    281       0.14
    Activation Density: 0.010%

    No Known Activations