INDEX
Explanations
Tokens that mark the assistant/response header or a conversation boundary (assistant role/header delimiter tokens).
Negative Logits
стор    -0.07
سانی    -0.06
zor    -0.06
[o    -0.06
moy    -0.06
ตร    -0.06
Faction    -0.06
Delete    -0.06
logout    -0.06
tim    -0.06
Positive Logits
gorgeous    0.07
complain    0.07
apologized    0.07
fails    0.07
blog    0.07
Sorry    0.07
erialized    0.06
BufferSize    0.06
Drill    0.06
해외    0.06
Activations Density: 0.051%