INDEX
Explanations
Model-generated, structured technical output (especially code/markup blocks and chat/turn markers) rather than ordinary user prose.
Negative Logits
icyclo    0.44
ambilan    0.44
荸    0.43
ост    0.43
້ອງ    0.41
mirea    0.41
зага    0.41
:[/    0.41
вроде    0.40
امیدوار    0.40
Positive Logits
AUT    0.46
GRAY    0.43
REL    0.41
FUR    0.41
benc    0.40
LEV    0.40
XX    0.39
information    0.39
Esc    0.39
RES    0.38
Activations Density 1.958%