Explanations
comparative adjectives expressing difficulty
references to difficulty or challenges
Negative Logits
Kings        -0.72
enture       -0.72
endar        -0.71
itas         -0.69
atern        -0.68
Lights       -0.66
ificantly    -0.65
erity        -0.65
mosp         -0.64
ript         -0.64
Positive Logits
coded        0.79
punishable   0.79
punished     0.77
forgiving    0.75
ACH          0.72
harder       0.71
slog         0.69
wired        0.68
hitters      0.67
prosecuted   0.67
Activations Density: 0.051%