INDEX
    Explanations

    phrases emphasizing personal agency and responsibility in relationships

    Negative Logits
    Token        Logit
    "mak"        -0.15
    "udic"       -0.15
    "Ëĺ"         -0.14
    " Dawson"    -0.14
    " effected"  -0.14
    "473"        -0.14
    "mps"        -0.14
    "ापन"        -0.14
    " bunu"      -0.13
    "obe"        -0.13
    Positive Logits
    Token        Logit
    " do"        0.38
    " does"      0.36
    " did"       0.31
    "does"       0.29
    "do"         0.27
    " Does"      0.26
    "\tdo"       0.25
    " Do"        0.24
    "_do"        0.23
    " DOES"      0.23
    Activation Density: 0.095%

    No Known Activations