INDEX
    Explanations

    concepts related to self-sufficiency and independence

    Negative Logits
    ewise
    -0.16
     odst
    -0.15
     Weird
    -0.15
    ALSE
    -0.14
    uml
    -0.14
    wort
    -0.14
    arks
    -0.14
    anger
    -0.14
    oq
    -0.14
     защиты
    -0.14
    Positive Logits
    /self
    0.25
     self
    0.23
     independently
    0.19
    自
    0.18
     Self
    0.18
    Self
    0.17
    -self
    0.17
     SELF
    0.17
    self
    0.17
    (Self
    0.16
    Act Density 0.158%

    No Known Activations