    Explanations

    words related to following instructions or actions

    instances of the word "follow"

    Negative Logits
    Token            Logit
    "pite"           -0.73
    "orc"            -0.69
    "ukemia"         -0.67
    " wounding"      -0.67
    "aucas"          -0.67
    "inese"          -0.66
    "ldom"           -0.65
    "intendent"      -0.64
    "rimination"     -0.64
    "Extreme"        -0.63

    Positive Logits
    Token            Logit
    "follow"          0.91
    " follow"         0.85
    " Follow"         0.83
    " follows"        0.82
    "LLOW"            0.78
    "ĸļ"              0.76
    " suit"           0.72
    " faithfully"     0.71
    "SHIP"            0.70
    "ansen"           0.68

    Activation Density: 0.025%

    No Known Activations