INDEX
    Explanations

    phrases indicating the initiation or progression of actions and feelings

    Negative Logits
    Token             Logit
    " continue"       -0.23
    " continued"      -0.22
    "continue"        -0.21
    " continuing"     -0.21
    " continuation"   -0.19
    " continues"      -0.19
    "\tcontinue"      -0.19
    "continued"       -0.18
    " still"          -0.18
    " continuar"      -0.17
    Positive Logits
    Token             Logit
    "ying"            0.21
    " поба"           0.15
    " notice"         0.15
    " serious"        0.15
    " noticed"        0.15
    "845"             0.15
    "lesh"            0.15
    " taper"          0.15
    " seriously"      0.15
    " NOTICE"         0.15
    Activation Density: 0.035%

    No Known Activations