Explanations
adverbs used to express certainty or opinion
words indicating levels of certainty or speculation
Negative Logits
iage            -0.66
anners          -0.61
kefeller        -0.60
lain            -0.56
prus            -0.55
ategory         -0.55
itance          -0.55
\'              -0.55
UGE             -0.55
riers           -0.54

Positive Logits
surprisingly    0.75
importantly     0.74
unconsciously   0.74
impro           0.73
anyway          0.73
unwittingly     0.73
guessed         0.72
asionally       0.71
ironically      0.71
,               0.70
Activations Density 0.119%