Explanations
negative phrases or expressions of doubt
Negation before "surprise" or similar words
not surprising / no surprise
Negative Logits
Token            Logit
للمعارف          -0.67
mobileqq         -0.65
Tikang           -0.63
initComponents   -0.62
!*\              -0.60
HtmlAttribute    -0.59
autorytatywna    -0.59
__':             -0.58
اشتی             -0.57
writeValue       -0.57
Positive Logits
Token            Logit
surprise         1.45
surprise         1.17
surprised        1.10
Surprise         1.07
surprises        1.06
surprising       1.05
shock            1.02
wonder           1.02
surpris          1.00
Surprise         1.00
Activations Density: 0.171%