    Explanations

    elements related to identity and personal history

    Negative Logits
    adam
    -0.16
    πή
    -0.16
    大全
    -0.14
    んど
    -0.14
     Fuse
    -0.14
    uco
    -0.14
    antas
    -0.14
    uest
    -0.13
     kå
    -0.13
    iaux
    -0.13
    Positive Logits
     syn
    0.28
     rod
    0.25
     Rod
    0.23
     род
    0.21
    Rod
    0.21
     brat
    0.20
    rod
    0.20
     Syn
    0.19
     adopt
    0.18
     rods
    0.18
    Act Density 0.024%

    No Known Activations