INDEX
Explanations
phrases that express certainty and surprise regarding various contexts
Negative Logits
zzle      -0.16
νÏİ       -0.15
isses     -0.15
оÑĢÑĤÑĥ    -0.15
undle     -0.15
åį        -0.15
izik      -0.14
addock    -0.14
owitz     -0.14
ÑģÑĤи     -0.14
Positive Logits
wonder     0.56
Wonder     0.40
wondered   0.33
surprise   0.31
Wonder     0.31
wonders    0.30
wondering  0.30
unsur      0.30
sur        0.26
onder      0.26
Activations Density: 0.099%