INDEX
Explanations
phrases emphasizing gratitude or positive recognition
Negative Logits
therefore     -0.17
thus          -0.17
thus          -0.16
?v            -0.16
iek           -0.15
imat          -0.15
then          -0.14
consequently  -0.14
ackbar        -0.14
setStatus     -0.14
Positive Logits
hope          0.33
Hope          0.25
hopes         0.24
Hope          0.24
hope          0.24
hoping        0.22
glad          0.20
Hopefully     0.19
Hopefully     0.18
hoped         0.18
Activations Density: 0.061%