Explanations
questions at the end of sentences
questions or inquiries, particularly those seeking clarification or understanding
Negative Logits
Token               Logit
athe                -0.76
ema                 -0.72
evening             -0.72
background          -0.70
cipled              -0.66
aku                 -0.66
swamp               -0.66
yak                 -0.66
flagged             -0.66
guiActiveUnfocused  -0.65
Positive Logits
Token               Logit
Well                1.22
Turns               1.05
Probably            1.05
Surely              1.05
Aren                1.04
Firstly             1.02
Perhaps             1.02
Why                 1.00
Answer              0.99
Certainly           0.98
Activations Density: 0.093%