INDEX
    Explanations

    verbs expressing likelihood or preference

    phrases that indicate tendencies or patterns in behavior

    Negative Logits
    Token         Logit
    "yet"         -0.81
    "hello"       -0.74
    "bats"        -0.73
    "teen"        -0.67
    "raq"         -0.66
    "ATS"         -0.66
    "Ready"       -0.65
    "spection"    -0.64
    "Valid"       -0.64
    "lights"      -0.63
    Positive Logits
    Token               Logit
    " prioritize"       1.32
    " underestimate"    1.27
    " behave"           1.22
    " concentrate"      1.20
    " specialize"       1.20
    " emphasize"        1.19
    " prefer"           1.18
    " accumulate"       1.17
    " overest"          1.16
    " rely"             1.16
    Activation Density: 0.100%

    No Known Activations