Explanations
mentions of sports teams and events
Negative Logits
ezvous        -0.69
brisk         -0.67
veyard        -0.61
Theft         -0.58
uncle         -0.57
ust           -0.57
escription    -0.57
illes         -0.56
astery        -0.56
delet         -0.56
Positive Logits
alike         1.13
erning        0.93
who           0.87
ãĤ®           0.87
wanting       0.84
wishing       0.81
downstream    0.79
umers         0.79
enthusi       0.77
aspiring      0.76
Activations Density 1.956%