INDEX
    Explanations

    mentions of sports and athleticism

    Negative Logits
    Token     Logit
    enter     -0.16
    anted     -0.16
    hausen    -0.16
    keiten    -0.15
    AP        -0.14
    nable     -0.14
    bourne    -0.14
    leen      -0.14
    brick     -0.14
    uite      -0.14
    Positive Logits
    Token     Logit
    ive       0.39
    sw        0.34
    ively     0.29
    scar      0.28
    ives      0.26
    ivo       0.26
    y         0.26
    sp        0.25
    ived      0.25
    sc        0.24
    Activation Density: 0.021%

    No Known Activations