Explanations
phrases indicating predictions or future events
Negative Logits
sometimes      -0.64
sometimes      -0.57
often          -0.55
有时           -0.55
后来           -0.54
Sometimes      -0.54
Wikimedijinoj  -0.53
often          -0.51
Sometimes      -0.50
иногда         -0.50
Positive Logits
hopefully      1.28
Hopefully      1.22
Hopefully      1.21
hopefully      1.15
Fingers        1.09
Fingers        1.01
fingers        0.96
fingers        0.92
Expect         0.91
🤞             0.90
Activations Density 0.581%