INDEX
Explanations
adjectives and adverbs denoting degree or intensity
phrases that convey a sense of approximation or degree
NEGATIVE LOGITS
Token      Logit
DOM        -0.75
mage       -0.72
ILE        -0.66
umbnail    -0.66
rongh      -0.65
ipes       -0.62
ngth       -0.62
asc        -0.61
gemony     -0.60
runs       -0.60
POSITIVE LOGITS
Token        Logit
unclear      0.88
ironic       0.85
obvious      0.84
impossible   0.84
raining      0.81
uphill       0.81
evident      0.80
interesting  0.79
easy         0.79
easier       0.77
Activations Density: 0.319%