INDEX
    Explanations

    imperative verbs related to encouragement or action

    Negative Logits
    Token       Logit
    "ìĬµ"       -0.08
    "è·¡"       -0.07
    "gate"      -0.07
    "byss"      -0.07
    "ì§ij"       -0.06
    "tingham"   -0.06
    "came"      -0.06
    "yster"     -0.06
    "gard"      -0.06
    "ưỡng"      -0.06
    Positive Logits
    Token       Logit
    " ahead"    0.14
    " Ahead"    0.11
    "ahead"     0.11
    " figure"   0.10
    "Ahead"     0.09
    " Go"       0.09
    "Go"        0.09
    " go"       0.08
    "-go"       0.08
    "åIJ§"       0.08
    Activation Density: 0.008%

    No Known Activations