INDEX
Explanations
occurrences of the word "qu" or related terms signifying questioning or uncertainty
Negative Logits
o         -0.26
an        -0.19
oje       -0.18
ailles    -0.18
ugi       -0.17
oq        -0.17
andi      -0.16
i         -0.16
oč        -0.16
olet      -0.15
Positive Logits
ench      0.25
izz       0.22
ies       0.22
irk       0.21
qu        0.20
able      0.20
eso       0.19
asic      0.19
esion     0.19
ashed     0.18
Activations Density 0.010%
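
For readers unfamiliar with these fields, here is a minimal sketch of how figures like the ones above could be produced. It rests on two assumptions that the page does not confirm: that each token's logit is the dot product of the feature's output direction with that token's unembedding row, and that activation density is the fraction of sampled tokens on which the feature fires (activation > 0). All names (`logit_effects`, `top_and_bottom`, `activation_density`) and all numbers are hypothetical toy values, not data from this feature.

```python
from typing import Dict, List, Sequence, Tuple


def logit_effects(direction: Sequence[float],
                  unembed: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Dot product of the feature direction with each token's unembedding row.

    Assumption: this is how the dashboard derives its per-token logits.
    """
    return {tok: sum(w * d for w, d in zip(row, direction))
            for tok, row in unembed.items()}


def top_and_bottom(effects: Dict[str, float],
                   k: int) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (k most positive, k most negative) tokens, strongest first."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1])
    positive = ranked[-k:][::-1]   # largest values, descending
    negative = ranked[:k]          # smallest values, ascending
    return positive, negative


def activation_density(activations: Sequence[float]) -> float:
    """Fraction of sampled tokens where the feature is active (> 0)."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


# Toy data only: a 2-dimensional "model" with four made-up tokens.
toy_direction = [1.0, -0.5]
toy_unembed = {
    "a": [0.20, 0.00],
    "b": [0.25, 0.00],
    "c": [-0.26, 0.00],
    "d": [0.00, 0.38],
}

effects = logit_effects(toy_direction, toy_unembed)
positive, negative = top_and_bottom(effects, k=2)

# One active token out of 10,000 sampled tokens.
toy_activations = [0.0] * 9999 + [1.3]
density = activation_density(toy_activations)
print(f"density = {density:.3%}")
```

Under these assumptions, a density of 0.010% means the feature fires on roughly one token in every 10,000 sampled, which is why the toy activation list has a single nonzero entry.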