Explanations
sports-related terms and player names
Negative Logits
byss         -0.71
Ô            -0.70
Witcher      -0.69
shown        -0.65
ikuman       -0.63
Patient      -0.63
ð            -0.62
Tsukuyomi    -0.62
reconc       -0.61
è£ħ          -0.61
Positive Logits
TING         1.05
tered        1.01
tering       1.00
ches         1.00
ted          0.99
ters         0.98
tle          0.95
wana         0.93
ting         0.92
hod          0.89
Activations Density 0.032%