    Explanations

    phrases related to expressing opinions or making decisions

    phrases indicating the need for action or change

    Negative Logits
    ady       -0.73
    izens     -0.65
    izen      -0.63
    emaker    -0.61
    eny       -0.61
    afe       -0.59
    owe       -0.59
    raft      -0.58
    ゴン      -0.57
    ower      -0.57
    Positive Logits
     yeah      1.06
     huh       0.97
     namely    0.92
     etc       0.89
     sir       0.86
     blah      0.86
     …"        0.85
     whereas   0.85
     maybe     0.83
     [         0.82
    Act Density 0.447%

    No Known Activations