INDEX
    Explanations

    phrases related to reasons, motives, or causes

    questions and statements about reasons, importance, and the nature of situations

    Negative Logits
     sqor           -0.66
    jri             -0.66
    abase           -0.64
    iewicz          -0.63
     dismant        -0.61
     fulfil         -0.61
     premises       -0.60
     feasibility    -0.59
     simulac        -0.58
     Ala            -0.58
    Positive Logits
    Reviewer        0.92
    forth           0.85
     bothering      0.84
    セ               0.83
     so             0.79
     bother         0.79
     reluctant      0.77
     singled        0.77
     persist        0.77
     hesitant       0.75
    Act Density 0.169%

    No Known Activations