INDEX
    Explanations

    references to interpersonal relationships and emotional turmoil

    Negative Logits
    kees
    -0.14
     pozdě
    -0.13
     Canter
    -0.12
    stell
    -0.12
    下去
    -0.12
    zel
    -0.12
    bac
    -0.12
     someday
    -0.12
    cid
    -0.12
    ワー
    -0.12
    Positive Logits
     before
    1.20
    before
    1.05
     antes
    0.95
     Before
    0.93
    Before
    0.91
     BEFORE
    0.89
    _before
    0.87
    -before
    0.86
    .before
    0.82
    	before
    0.81
    Activation Density 1.472%

    No Known Activations