INDEX
Explanations
questions ending with a question mark
questions that prompt inquiry or exploration
Negative Logits
belt        -0.75
ishable     -0.68
mia         -0.68
lam         -0.67
haul        -0.67
mot         -0.65
booked      -0.65
nar         -0.64
bole        -0.64
lining      -0.63
Positive Logits
Nope        1.08
Lastly      1.04
Answer      0.99
Flavoring   0.99
Et          0.97
Finally     0.96
Whatever    0.95
Inqu        0.95
Conversely  0.94
Perhaps     0.93
Activations Density: 0.054%