INDEX
Explanations
questions ending with a question mark
questions that express confusion or seek clarification
Negative Logits
apixel      -0.75
charism     -0.74
avorite     -0.73
respective  -0.73
apter       -0.72
uckland     -0.71
rament      -0.68
etheless    -0.66
ĸļ士         -0.66
alist       -0.66
Positive Logits
Surely      1.25
Why         1.25
Anyway      1.23
Isn         1.20
Where       1.20
What        1.13
Didn        1.13
?!          1.13
Answer      1.13
Wouldn      1.12
Activations Density: 0.115%