INDEX
Explanations
references to a specific sport, specifically tennis
references to the sport of tennis
Negative Logits
Token      Logit
utter      -0.79
Burnett    -0.71
ibus       -0.67
ibal       -0.66
usting     -0.66
usions     -0.62
Citizens   -0.62
plur       -0.61
phabet     -0.61
Kurd       -0.60
Positive Logits
Token        Logit
tennis       1.19
bledon       1.17
Tennis       1.07
volleyball   1.02
nas          0.88
bowl         0.87
ercise       0.84
balls        0.84
darts        0.80
chess        0.80
Activations Density: 0.008%
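
To put the density figure in concrete terms: 0.008% corresponds to roughly 8 active positions per 100,000 tokens. The sketch below shows one way such a figure could be computed. It assumes, without confirmation from this page, that density means the fraction of token positions where the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires. The page itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 8 of which are active.
example = [0.0] * 99_992 + [1.5] * 8
density = activation_density(example)
print(f"{density:.3%}")  # prints 0.008%
```

A feature this sparse fires on very few tokens, which fits a narrow topic such as the tennis references described in the explanations.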