Explanations
phrases related to specific events, states, or locations
sentences that conclude with a period
Negative Logits
elbow       -0.68
ling        -0.65
monop       -0.65
neglig      -0.65
nodd        -0.65
lings       -0.64
footing     -0.64
raph        -0.63
coy         -0.63
optional    -0.63
Positive Logits
Their           1.00
Hopefully       0.99
Specifically    0.98
Unfortunately   0.96
Thankfully      0.96
Sadly           0.94
Whereas         0.92
Along           0.92
Fortunately     0.91
Luckily         0.90
Activations Density 0.930%