INDEX
    Explanations

    concepts related to self-identity and awareness

    Negative Logits
    Token            Logit
    " themselves"    -0.83
    " ourselves"     -0.72
    " himself"       -0.66
    " yourself"      -0.66
    " itself"        -0.65
    " myself"        -0.59
    " herself"       -0.58
    "PMID"           -0.58
    "OGND"           -0.58
    " RSSSF"         -0.58
    Positive Logits
    Token            Logit
    "standig"        0.67
    "ändig"          0.65
    "SELF"           0.63
    "s"              0.62
    "hood"           0.59
    "ly"             0.58
    "ständig"        0.56
    "帖最后由"       0.55
    "ishly"          0.54
    "IInterface"     0.54
    Activation Density: 0.143%

    No Known Activations