INDEX
Explanations
adjectives emphasizing a moderate degree or level
instances of the word "moderate" and its variations
Negative Logits
tyard      -0.77
kefeller   -0.73
ilage      -0.73
Chase      -0.71
arium      -0.70
inates     -0.65
ynthesis   -0.64
ledge      -0.64
stals      -0.64
tz         -0.63
Positive Logits
erate      0.95
minded     0.91
xual       0.87
sized      0.85
(<         0.76
moderate   0.75
escal      0.67
er         0.66
minded     0.66
weights    0.64
Activations Density: 0.041%