    Explanations

    phrases related to giving instructions or directives

    phrases that instruct or prompt actions

    Negative Logits
    " defe"       -0.61
    "ean"         -0.61
    " lik"        -0.58
    " prophes"    -0.58
    "NESS"        -0.57
    "iege"        -0.55
    " Lich"       -0.54
    "ikawa"       -0.54
    "ridge"       -0.53
    "ode"         -0.52
    Positive Logits
    " rid"                     1.14
    "TING"                     1.11
    "away"                     0.96
    "aways"                    0.89
    " acquainted"              0.79
    "tin"                      0.78
    "cloneembedreportprint"    0.78
    "ãĥ³ãĤ¸"                   0.78
    " Started"                 0.74
    " Away"                    0.72
    Activation Density: 0.081%

    No Known Activations