INDEX
    Explanations

    instances of the verb "go" and its variations

    Negative Logits
    Token         Logit
    "idis"        -0.18
    "inz"         -0.16
    "athon"       -0.15
    " tehdy"      -0.15
    "uh"          -0.15
    "GAN"         -0.14
    "vil"         -0.14
    "riel"        -0.14
    "-dismiss"    -0.14
    "ahren"       -0.14

    Positive Logits
    Token         Logit
    " with"        0.27
    " ahead"       0.20
    " for"         0.18
    " old"         0.17
    " avec"        0.16
    " ult"         0.16
    "586"          0.15
    "old"          0.15
    "agon"         0.15
    ".with"        0.14
    Act Density 0.047%

    No Known Activations