INDEX
    Explanations

    references to the Olympic Games

    references to the Olympics and related events

    Negative Logits
    lessly      -0.84
    ...]        -0.74
    cha         -0.72
    edly        -0.70
    erd         -0.70
    othal       -0.69
    lessness    -0.68
    finding     -0.68
    ppings      -0.67
    ansas       -0.66
    Positive Logits
     Olympic    0.98
     medal      0.91
     Olympics   0.89
     Games      0.85
     athletes   0.84
     Torch      0.82
     athlete    0.81
     Prize      0.81
     Medal      0.80
     gymn       0.80
    Act Density 0.011%

    No Known Activations