INDEX
Explanations
mentions of different sports
references to sports
Negative Logits
Token      Logit
pts        -0.72
idges      -0.69
ignt       -0.68
peror      -0.67
sidx       -0.66
ainers     -0.65
inki       -0.65
ceilings   -0.64
ipher      -0.64
enser      -0.63
Positive Logits
Token      Logit
nell       1.05
Sport      1.04
sport      0.98
sw         0.88
sports     0.85
enegger    0.83
iest       0.78
club       0.76
Sport      0.76
scar       0.75
Activations Density: 0.006%