Explanations
instances of polite requests or inquiries in conversation
Negative Logits
gaard       -0.17
居民        -0.16
avar        -0.15
oque        -0.15
lu          -0.14
afa         -0.14
ynes        -0.14
ubs         -0.14
öm          -0.14
bureaucr    -0.14
Positive Logits
çͳ         0.16
YRO         0.16
rana        0.16
erea        0.15
ROTO        0.14
rang        0.14
aptcha      0.14
вичай       0.13
agers       0.13
AGER        0.13
Activations Density 0.056%