INDEX
Explanations
references to boxing and competitive sports
Negative Logits
infeld      -0.16
çİ©         -0.15
engers      -0.15
fur         -0.15
ızı         -0.14
تدÙī       -0.14
à¥Ĥà¤ļ      -0.14
runner      -0.14
ophon       -0.14
arov        -0.13
Positive Logits
boxing      0.41
fighters    0.39
fight       0.38
box         0.36
boxing      0.36
Boxing      0.35
Fight       0.35
fighting    0.34
fights      0.34
BOX         0.34
Activations Density 0.167%