    Explanations

    terms related to martial arts, specifically those associated with Kung Fu

    Negative Logits
    Token    Logit
    hra      -0.16
    hn       -0.16
    chn      -0.15
    iti      -0.15
    vable    -0.15
    ξη       -0.15
    izado    -0.15
    etz      -0.15
    hong     -0.14
    ethe     -0.14
    Positive Logits
    Token    Logit
    sten     0.19
    lasses   0.19
    su       0.18
    arian    0.16
    uestion  0.16
    flen     0.16
    lish     0.16
    aroo     0.15
    tol      0.15
    kas      0.15
    Activation Density: 0.007%

    No Known Activations