INDEX
Explanations
instances of scoring and points in sports-related content
Negative Logits
ÙĪØ¯      -0.17
LEC       -0.15
embers    -0.15
(always   -0.15
iferay    -0.15
loven     -0.15
lad       -0.15
lace      -0.14
sworth    -0.14
uded      -0.14
Positive Logits
icl       0.15
yd        0.15
OOT       0.14
inand     0.14
ãĥ©ãĤ¤ãĥ³  0.14
ihn       0.14
urst      0.14
lek       0.14
iol       0.13
iene      0.13
Activations Density 0.010%