INDEX
    Explanations

    specific instructions or prohibitions

    phrases containing the word "that" in various contexts, indicating a focus on specifying conditions or details

    Negative Logits
    ————          -0.61
     whisk        -0.60
    ointed        -0.57
    apsed         -0.57
    ahime         -0.55
    eely          -0.54
     etched       -0.54
     realizing    -0.54
    emon          -0.54
    Hope          -0.53
    Positive Logits
     violates     1.23
     exceeds      1.21
     involves     1.16
     contradicts  1.14
     occurs       1.08
     disagrees    1.08
     qualifies    1.06
     isn          1.05
     doesn        1.05
     satisfies    1.01
    Act Density 0.156%

    No Known Activations