    Explanations

    phrases indicating a requirement, prohibition, or consideration of opinions or actions

    Negative Logits
    "cised"          -0.65
    " Completed"     -0.64
    " afore"         -0.63
    " Casting"       -0.63
    " Dise"          -0.62
    "ipel"           -0.61
    " Learning"      -0.61
    "milo"           -0.60
    " vanquished"    -0.58
    " Semi"          -0.57
    Positive Logits
    "'t"             1.43
    "ned"            1.13
    "ates"           0.92
    "ning"           0.91
    "atives"         0.90
    "nell"           0.85
    "nels"           0.84
    "etsk"           0.83
    "uts"            0.82
    "kie"            0.81
    Activation Density: 0.112%

    No Known Activations