INDEX
Explanations
references to the Olympic Games
NEGATIVE LOGITS
...]         -0.78
utils        -0.71
lessly       -0.70
cha          -0.69
hra          -0.68
drawn        -0.68
vironment    -0.67
finding      -0.64
BOOK         -0.62
flies        -0.61
POSITIVE LOGITS
Olympic      1.13
athletes     0.94
athlete      0.93
Athlet       0.91
gymn         0.87
IOC          0.86
Olympics     0.86
Torch        0.84
bledon       0.83
Olymp        0.78
Activations Density 0.005%