Explanations
User queries that address the assistant directly in the second person (especially “you”), often as “Do you …?” or “Can you …?” requests.
Negative Logits
俳    -0.08
Polygon    -0.07
loại    -0.07
葎    -0.07
additional    -0.07
וכמובן    -0.07
.itemId    -0.07
Scandin    -0.07
硪    -0.07
словам    -0.07
Positive Logits
ุ    0.07
حة    0.07
créé    0.07
時点    0.07
口号    0.07
诊治    0.07
by    0.07
birth    0.06
bất    0.06
Reads    0.06
Activations Density 0.041%