INDEX
Explanations
verbs expressing obligation, advice, or expectation
Negative Logits
Token       Logit
Fra         -0.77
Wid         -0.71
Hilbert     -0.67
Wolfgang    -0.61
maze        -0.59
Afgh        -0.59
Kah         -0.59
FW          -0.58
Cir         -0.58
WI          -0.58
Positive Logits
Token       Logit
beware      1.03
ered        1.01
be          0.90
ideally     0.88
nt          0.87
n           0.85
strive      0.84
aspire      0.83
reconsider  0.83
ering       0.82
Activations Density: 3.187%
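
The page does not define how the density figure is computed. A minimal sketch, assuming it is the share of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and chosen only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 8 token positions (not real data).
sample = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.0, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 25.000%
```

Under that reading, a density of 3.187% would mean the feature is active on roughly 3 of every 100 sampled tokens.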