INDEX
    Explanations

    verbs related to taking action or making changes

    Negative Logits
    starter         -0.63
    abet            -0.62
    sa              -0.61
    ija             -0.61
    raltar          -0.61
    spot            -0.61
    den             -0.60
    ahoo            -0.60
    mad             -0.59
     topped         -0.59
    Positive Logits
    ulate           0.96
     them           0.92
    orously         0.88
    uate            0.87
    ively           0.86
     their          0.85
    ibly            0.84
     our            0.81
     oneself        0.81
     themselves     0.78
    Act Density 3.061%

    No Known Activations