INDEX
    Explanations

    compound words and phrases that describe self-management and self-destructive behaviors

    Negative Logits
    Token        Logit
    "utin"       -0.15
    "amedi"      -0.15
    "ank"        -0.14
    "undry"      -0.14
    " Van"       -0.14
    " Jensen"    -0.13
    "argent"     -0.13
    " Mis"       -0.13
    "asper"      -0.13
    "oya"        -0.13
    Positive Logits
    Token        Logit
    "/self"      0.46
    " Self"      0.35
    "(Self"      0.32
    " self"      0.30
    "Self"       0.30
    " SELF"      0.28
    "-self"      0.27
    "self"       0.26
    ",self"      0.26
    "SELF"       0.24
    Activation Density: 0.036%

    No Known Activations