INDEX
Explanations
queries or questions ending with a question mark
interrogative sentences or questions
Negative Logits
encount     -0.81
eping       -0.71
cipled      -0.69
aled        -0.66
blob        -0.66
aku         -0.65
battered    -0.64
aler        -0.64
aper        -0.63
neau        -0.63
Positive Logits
Where       1.02
Surely      1.02
.?          1.01
Somebody    0.99
Why         0.98
Certainly   0.97
Anyway      0.94
Probably    0.94
[*          0.94
What        0.92
Activations Density 0.091%