INDEX
    Explanations

    words related to sports and athletes

    Negative Logits
    Token            Logit
    "olves"          -0.79
    "adh"            -0.72
    " outputs"       -0.67
    "¥µ"             -0.63
    "versive"        -0.61
    "animate"        -0.59
    "nih"            -0.59
    "ole"            -0.59
    "appropriate"    -0.59
    "feed"           -0.58
    Positive Logits
    Token            Logit
    " meanwhile"     1.24
    " however"       1.13
    " flanked"       1.07
    "enegger"        0.99
    " who"           0.98
    " pictured"      0.96
    " nicknamed"     0.94
    " whose"         0.94
    "whose"          0.92
    "who"            0.92
    Activation Density: 0.122%
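
    The density figure above is usually read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; the function name, the zero threshold, and the sample data are assumptions made for illustration, not the tool's actual computation.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions);
    the threshold of 0.0 is a hypothetical default.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 12 of them active.
sample = [0.0] * 9988 + [0.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under this reading, a density of 0.122% would correspond to roughly one active position in every 820 tokens.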

    No Known Activations