INDEX
Explanations
questions starting with "How"
questions that begin with "How."
Negative Logits
Token        Logit
goers        -0.66
ultimate     -0.64
ptions       -0.62
outer        -0.60
Feld         -0.58
hereafter    -0.57
article      -0.57
grounds      -0.57
piece        -0.56
room         -0.55
Positive Logits
Token        Logit
soever       1.14
ever         1.10
ells         1.04
beit         1.01
ling         0.94
itzer        0.91
dy           0.88
dare         0.82
leep         0.80
much         0.80
Activations Density: 0.076%