    Explanations

    phrases indicating reversal or contrast, particularly when describing situations or states that are opposite to expectations

    Negative Logits
    wright
    -0.16
    ugh
    -0.15
    rieve
    -0.15
    fell
    -0.15
     Stokes
    -0.14
    enn
    -0.14
    plex
    -0.14
     Arb
    -0.14
     Arbor
    -0.14
    vida
    -0.14
    Positive Logits
    ucha
    0.16
     Wort
    0.15
    jde
    0.15
    ews
    0.14
    erule
    0.14
    arehouse
    0.14
     Niet
    0.14
    цать
    0.13
    Rule
    0.13
    ZW
    0.13
    Act Density 0.009%

    No Known Activations