INDEX
    Explanations

    phrases indicating emotional or physical struggles and conflicts

    Negative Logits
    793         -0.16
    aight       -0.16
    alley       -0.15
    à¸Ńà¸ļ      -0.14
    .managed    -0.14
    792         -0.14
    taire       -0.14
    hower       -0.14
    ushman      -0.14
    819         -0.14
    Positive Logits
    atern       0.16
    Wo          0.15
    URA         0.15
    ayout       0.14
    ÅŁt         0.14
    ula         0.14
     military   0.14
    ?=.*        0.14
    @js         0.14
    endo        0.14
    Activation Density: 0.015%

    No Known Activations