INDEX
Explanations
phrases introducing speculation or possibilities
the word "perhaps," indicating uncertainty or speculation
Negative Logits
akes      -0.80
cium      -0.79
endar     -0.76
eral      -0.75
iphate    -0.74
say       -0.73
eph       -0.72
gem       -0.72
ature     -0.72
ship      -0.71
Positive Logits
unsurprisingly    1.47
sensing           1.04
unsur             1.03
surprisingly      1.02
reflecting        0.96
predictably       0.93
ironically        0.92
understandably    0.89
unwittingly       0.87
embold            0.86
Activations Density: 0.049%