INDEX
Explanations
Markers that denote the start of a speaker or header section in conversation formatting (conversation boundary or header delimiters).
Negative Logits
323       -0.08
&         -0.07
فراهم     -0.07
-IS       -0.07
aun       -0.07
Zust      -0.07
便        -0.06
_arrow    -0.06
424       -0.06
真        -0.06
Positive Logits
../../../     0.07
čka           0.07
�             0.07
_configs      0.07
化学          0.06
лоб           0.06
#'            0.06
interfaces    0.06
گو            0.06
(Role         0.06
Activations Density 0.242%