INDEX
    Explanations

    statements indicating action or decision-making

    words and phrases indicating direct statements or actions

    Negative Logits
    tis          -0.77
    currently    -0.76
    enei         -0.74
    alg          -0.70
    wayne        -0.67
    arel         -0.66
     ordinarily  -0.65
    linked       -0.62
    gae          -0.61
    typically    -0.61
    Positive Logits
     wrong            0.77
     inappropriately  0.77
     Doct             0.71
     LAST             0.70
     mistakes         0.69
     beforehand       0.69
     last             0.69
     yesterday        0.67
    ocument           0.65
    イト              0.62
    Act Density 0.510%

    No Known Activations