Explanations
references to sports, especially basketball
Negative Logits
Token          Logit
vention        -0.71
eers           -0.71
Mith           -0.65
sters          -0.64
Lei            -0.62
quo            -0.62
capped         -0.61
Enforcement    -0.61
manship        -0.60
Ars            -0.58
Positive Logits
Token          Logit
anny           1.20
umpy           1.20
udge           1.17
iffin          1.15
ands           1.06
acies          1.05
illing         1.04
iff            1.03
itty           1.03
itt            1.02
Activations Density: 0.538%