INDEX
Explanations
sports scores in the format of team A's score followed by team B's score
scores or outcomes of sports events
Negative Logits
Token         Logit
swearing      -0.72
enegger       -0.69
igmat         -0.67
skelet        -0.61
trem          -0.60
descriptions  -0.58
pict          -0.58
tabl          -0.58
obe           -0.57
sway          -0.57
Positive Logits
Token         Logit
vs            1.00
ND            0.94
½             0.89
Aren          0.88
GG            0.85
LCS           0.83
DS            0.81
NP            0.80
nd            0.78
rd            0.78
Activations Density: 0.063%