INDEX
Explanations
rhetorical questions expressing curiosity or surprise
Negative Logits
paren            -0.17
emer             -0.16
lator            -0.15
mant             -0.15
247              -0.15
umbo             -0.14
ole              -0.14
(no token text)  -0.14
ums              -0.14
onBind           -0.14
Positive Logits
Wass    0.16
iked    0.16
Desert  0.15
desert  0.14
GPC     0.14
ruk     0.14
kate    0.14
záp     0.14
tieten  0.14
scor    0.14
Activations Density: 0.013%