INDEX
Explanations
adjectives and adverbs related to quantity and degree
Negative Logits
utics         -0.95
amia          -0.82
attery        -0.80
Ear           -0.80
apons         -0.79
ilaterally    -0.78
itored        -0.77
ushima        -0.76
ensis         -0.75
their         -0.74
Positive Logits
implication   1.15
assumption    1.14
question      1.12
downside      1.02
answer        1.00
takeaway      0.99
caveat        0.98
rationale     0.97
majority      0.97
impetus       0.96
Activations Density: 0.343%