INDEX
    Explanations

    elements related to conflict and action

    Negative Logits
    Token       Logit
    "irie"      -0.18
    "hind"      -0.16
    "avou"      -0.16
    " Hind"     -0.16
    ".nano"     -0.15
    "zew"       -0.15
    "VERR"      -0.15
    "393"       -0.15
    "ophon"     -0.14
    "हन"        -0.14
    Positive Logits
    Token         Logit
    "olean"       0.15
    "lette"       0.15
    "cx"          0.15
    "井"          0.15
    " withhold"   0.14
    "igit"        0.14
    "lexical"     0.14
    "ธรรม"        0.14
    "umbled"      0.14
    "벨"          0.14
    Activation Density: 0.459%

    No Known Activations