INDEX
    Explanations

    references to various forms and contexts of therapy

    Negative Logits
        Token        Logit
        (blank)      -0.18
        "aji"        -0.15
        "shit"       -0.15
        "że"         -0.15
        "arel"       -0.15
        "tip"        -0.14
        "erness"     -0.14
        "eniable"    -0.14
        "s"          -0.14
        "asses"      -0.14
    Positive Logits
        Token        Logit
        "iltr"       0.17
        "isted"      0.17
        " Shaw"      0.16
        "avicon"     0.15
        "ically"     0.15
        "apeutic"    0.15
        "atically"   0.15
        "ensible"    0.14
        "fully"      0.14
        "ooter"      0.14
    Activation Density: 0.022%

    No Known Activations