Explanations
adjectives related to seriousness
instances of the word "serious."
Negative Logits
Token           Logit
atu             -0.87
wright          -0.85
enaries         -0.85
eez             -0.75
Ĥİ              -0.72
ucky            -0.71
av              -0.71
ifully          -0.70
orious          -0.70
via             -0.68
Positive Logits
Token           Logit
lly             0.95
consideration   0.87
serious         0.84
contender       0.81
serious         0.80
enough          0.77
nces            0.74
understatement  0.74
dent            0.73
gn              0.72
Activations Density 0.031%