INDEX
Explanations
phrases that indicate curiosity or questioning
expressions of curiosity or questioning
Negative Logits
idelines    -0.64
Guidelines  -0.62
IDs         -0.61
HF          -0.61
Low         -0.60
Aff         -0.60
BA          -0.60
stra        -0.59
ICO         -0.58
Locke       -0.57
Positive Logits
wonder      4.09
wonders     2.45
marvel      1.94
wondered    1.87
wondering   1.73
Wonders     1.51
doubt       1.39
amaz        1.38
aston       1.33
pity        1.33
Activations Density 0.011%