INDEX
    Explanations

    phrases related to commands or instructions

    phrases that indicate warnings or prohibitions

    Negative Logits
    Token             Logit
    " effic"          -0.62
    " basics"         -0.60
    " exemplary"      -0.57
    " wonderfully"    -0.56
    "rocal"           -0.56
    " admirable"      -0.55
    " excellent"      -0.55
    " awesome"        -0.54
    " unparalleled"   -0.54
    " amazing"        -0.54
    Positive Logits
    Token             Logit
    " anymore"        1.88
    " unless"         1.79
    "unless"          1.66
    " nor"            1.41
    " lest"           1.36
    " until"          1.33
    "until"           1.33
    " because"        1.28
    " anytime"        1.26
    " whatsoever"     1.21
    Act Density 0.738%
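
    The density figure above is usually read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes "fires" means an activation strictly above zero, and the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical, not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical per-token activations for one feature (illustrative only).
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)

# Format the fraction the same way the page reports it.
print(f"Act Density {density:.3%}")
```

    Under this reading, a density of 0.738% would correspond to the feature being active on roughly 738 of every 100,000 tokens.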

    No Known Activations