INDEX
Explanations
the word "think" and its variations in different contexts
forms of 'think'
Negative Logits
<bos>       -0.59
Guill       -0.57
Jacobsen    -0.55
Maurer      -0.54
Jaffe       -0.52
Jacobson    -0.48
SAA         -0.47
Joel        -0.47
CCL         -0.46
Mons        -0.46
Positive Logits
THINK       1.22
Think       1.22
Think       1.19
think       1.18
think       1.18
THINK       1.09
thinks      1.08
thought     0.89
Thinking    0.88
thinking    0.87
Activations Density 0.030%
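
The density figure can be read as the share of sampled token positions on which the feature fires. The following is a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active (activation strictly greater than the threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 10,000 token positions.
sample = [0.0] * 9997 + [0.8, 1.5, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.030%
```

A density this low would mean the feature is sparse: it is silent on almost all tokens and active only on a small set, here presumably forms of "think".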