INDEX
    Explanations

    phrases related to evaluation or assessment

    phrases indicating the existence or importance of certain ideas or actions

    Negative Logits
    ゴン            -0.84
    iren          -0.82
    aired         -0.81
    urred         -0.76
    aughtered     -0.75
    retched       -0.74
    izens         -0.74
    pered         -0.73
    affiliated    -0.72
    rup           -0.69
    Positive Logits
     gonna        0.98
     figuring     0.97
     trying       0.86
     getting      0.85
     how          0.81
     consistency  0.81
     educating    0.80
     putting      0.80
     thanking     0.79
     respecting   0.79
    Act Density 0.265%

    No Known Activations