INDEX
Explanations
references to the Olympic Games
references to the Olympics and related events
Negative Logits
lessly     -0.79
erd        -0.73
ogyn       -0.73
...]       -0.72
ologies    -0.71
utils      -0.71
cha        -0.70
arial      -0.70
edly       -0.69
othal      -0.68
POSITIVE LOGITS
Olympic     0.97
medal       0.92
Olympics    0.87
athletes    0.85
Games       0.84
athlete     0.83
Torch       0.83
medals      0.83
Prize       0.81
Medal       0.79
Activations Density 0.012%
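
For context, an activation density like the one above is commonly read as the fraction of token positions on which the feature fires at all. The sketch below assumes that definition; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    (here: above-threshold) activation. This is an illustrative definition,
    not necessarily the exact one used to produce the figure on this page.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 3 of 25,000 tokens.
sample = [0.0] * 24997 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.012% means the feature is active on roughly 1 token in 8,000, which fits a narrow topical feature such as one keyed to Olympic vocabulary.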