INDEX
Explanations
references to rugby players and their achievements
Negative Logits
pleaſure          -0.72
ftate             -0.65
houſe             -0.64
ſtate             -0.63
צלחה              -0.63
atürk             -0.63
reafon            -0.63
ſame              -0.61
DeWitt            -0.59
faſt              -0.59
Positive Logits
rugby             1.22
Rugby             1.09
Rugby             1.07
rugby             0.99
🏉                0.86
(token not shown) 0.68
arXiv             0.63
IndentedString    0.63
quelize           0.58
msglen            0.57
Activations Density: 0.021%