INDEX
Explanations
Chinese words or characters
specific special characters or symbols in text
NEGATIVE LOGITS
Token      Logit
ufact      -0.92
eger       -0.72
phia       -0.69
doms       -0.68
eah        -0.65
goats      -0.65
ourt       -0.64
imore      -0.64
sembly     -0.63
ensical    -0.61
POSITIVE LOGITS
Token      Logit
CHAT       0.79
à¼         0.75
ãĥĢ        0.75
ãĤĬ        0.72
×Ļ         0.69
ª          0.66
isoft      0.66
tain       0.64
ËĪ         0.63
Slot       0.63
Activations Density 0.157%
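
Several of the positive-logit tokens (for example ãĥĢ, ãĤĬ, ×Ļ, ËĪ) look like byte-level BPE token strings rather than readable text. The sketch below assumes, without confirmation from this page, that the vocabulary uses the GPT-2-style byte-to-character mapping; the helper names `bytes_to_unicode` and `decode_token` are illustrative. It maps each displayed character back to its raw byte and decodes the result as UTF-8.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table of GPT-2-style byte-level BPE.

    Assumption: the tokens shown on this page come from such a vocabulary.
    """
    # Bytes that are already printable keep their own code point.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # '¡' .. '¬'
          + list(range(0xAE, 0x100)))  # '®' .. 'ÿ'
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Turn a displayed token string back into text (hypothetical helper)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial multi-byte sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥĢ", "ãĤĬ", "×Ļ", "ËĪ"]:
        print(repr(tok), "->", decode_token(tok))
```

Under that assumption, ãĥĢ and ãĤĬ decode to the Japanese kana ダ and り, ×Ļ to the Hebrew letter י, and ËĪ to the modifier letter ˈ. Tokens such as à¼ and ª are fragments of multi-byte characters, so on their own they decode only to the replacement character.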