INDEX
Explanations
adjectives implying humility or simplicity
expressions of the word "modest" indicating varying degrees of smallness or restraint
Negative Logits
emis      -0.70
oldown    -0.69
AW        -0.69
ACTION    -0.69
SW        -0.64
AMS       -0.63
Aval      -0.62
onz       -0.61
izzard    -0.60
avez      -0.60
Positive Logits
ly             1.37
beginnings     0.93
sized          0.91
consolation    0.90
(<             0.88
LY             0.87
ally           0.87
minded         0.78
lys            0.77
amounts        0.77
Activations Density 0.026%