Explanations
names of sports teams and cities
references to sports teams and related events
Negative Logits
goddamn        -0.60
.)             -0.58
fucking        -0.57
.–             -0.52
answ           -0.51
"+             -0.51
accordingly    -0.51
twelve         -0.50
Mandatory      -0.50
fuckin         -0.50
Positive Logits
etc            1.03
and            0.97
elsius         0.74
etc            0.73
ect            0.72
ADRA           0.65
or             0.62
merce          0.61
ibia           0.59
berra          0.57
Activations Density 0.473%