INDEX
Explanations
references to the Los Angeles Dodgers baseball team
mentions of the Dodgers baseball team
Negative Logits
Token        Logit
lying        -0.88
uters        -0.85
rha          -0.73
itia         -0.71
awaru        -0.70
ugal         -0.70
essional     -0.68
rend         -0.66
ħĭ           -0.66
ta           -0.66
Positive Logits
Token        Logit
Dodgers      1.25
Padres       1.01
Doodle       0.90
Baseball     0.85
Chargers     0.83
outfielder   0.82
Stadium      0.80
reliever     0.80
Relief       0.78
Clippers     0.77
Activations Density 0.005%
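
For scale, a density of 0.005% corresponds to about 5 active tokens per 100,000. The sketch below assumes that density means the fraction of tokens on which the feature's activation is above zero; the page does not state the definition. The function name, the threshold, and the sample activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 5 active tokens out of 100,000.
acts = [0.0] * 99_995 + [1.3, 0.7, 2.1, 0.4, 0.9]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.005%
```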