INDEX
    Explanations

    phrases related to advising someone to take a specific action or make a decision

    Negative Logits
    "ament"        -0.71
    "creen"        -0.67
    "ificent"      -0.66
    "eers"         -0.65
    "itionally"    -0.64
    "ullah"        -0.64
    "oret"         -0.60
    " Horus"       -0.59
    "icio"         -0.59
    "ifully"       -0.59
    Positive Logits
    "vt"            1.12
    "ggle"          1.05
    "verning"       1.01
    "lems"          1.01
    " overboard"    0.95
    "ALK"           0.91
    " ahead"        0.86
    " forth"        0.83
    " unnoticed"    0.83
    "ogly"          0.82
    Act Density 0.095%

    No Known Activations