Explanations
instances of the word "think" and its variations, indicating a focus on thoughts and opinions
Negative Logits
Jacobsen    -0.53
Maurer      -0.49
<bos>       -0.47
Quig        -0.46
Guill       -0.46
»           -0.43
Joel        -0.43
Jaffe       -0.43
«           -0.42
Jacobson    -0.42
Positive Logits
Think       1.33
Think       1.32
THINK       1.30
think       1.27
think       1.25
THINK       1.21
thinks      1.16
Thinking    1.04
thinking    1.02
thought     0.98
Activations Density: 0.080%