Explanations
conversational transition words
direct, casual conversational cues—especially second-person address and short social/emotional prompts in user queries.
Negative Logits
приве  0.26
ಖ  0.25
வைர  0.25
prolific  0.25
miR  0.24
الإ  0.24
Threat  0.24
пла  0.24
Lim  0.24
desider  0.24
Positive Logits
lakini  0.33
nhưng  0.33
ngunit  0.31
pero  0.31
mutta  0.31
сегодня  0.30
但是  0.30
কিন্তু  0.30
ьогодні  0.30
?/  0.30
Activations Density 0.336%