Explanations
questions or statements ending in a question mark
rhetorical questions and expressions of uncertainty
Negative Logits
anwhile      -0.71
escription   -0.64
arde         -0.63
laun         -0.63
İĭ           -0.62
luence       -0.61
ronics       -0.61
CRIP         -0.61
Skydragon    -0.60
monton       -0.60
Positive Logits
yes          1.98
yes          1.97
Yes          1.90
Yes          1.89
YES          1.88
YES          1.83
Nope         1.75
Absolutely   1.73
Absolutely   1.60
No           1.47
Activations Density 0.302%