INDEX
Explanations
mentions of various types of "other" categories and miscellaneous items
Negative Logits
uzzi     -0.16
ushi     -0.15
sst      -0.15
sted     -0.15
agem     -0.15
hare     -0.14
urf      -0.14
ãĥĮ      -0.14
leur     -0.13
ullah    -0.13
Positive Logits
ellaneous    0.22
Weiss        0.17
Fleet        0.16
/misc        0.15
nth          0.15
idon         0.14
Wis          0.14
vise         0.14
WE           0.14
fleet        0.14
Activations Density 0.073%