Explanations
concepts related to thought and introspection
Negative Logits
FilterChain    -0.79
drücken        -0.73
o              -0.72
ers            -0.71
legungen       -0.71
(blank)        -0.70
k              -0.70
dą             -0.68
5              -0.68
er             -0.68
Positive Logits
Thought         1.44
THOUGHT         1.41
thought         1.38
Thought         1.32
thought         1.25
thoughts        1.05
thoughts        1.04
Thoughts        0.98
SOT             0.93
pensées         0.87
Activations Density 0.090%
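
The page does not say how the logit values or the activation density were computed. A common convention, assumed here rather than confirmed by the source, is that a feature's logit effect on a token is the dot product of the feature's decoder direction with that token's unembedding vector, and that activation density is the fraction of sampled tokens on which the feature fires. The sketch below illustrates both under that assumption; the function names and the toy numbers are hypothetical and chosen only for illustration.

```python
# Minimal sketch under stated assumptions; not the dashboard's actual code.
# Assumption: logit effect = decoder direction . token unembedding vector.
# Assumption: activation density = share of tokens with activation > 0.


def logit_effects(decoder_direction, unembedding_rows, vocab):
    """Return (token, effect) pairs sorted from most positive to most negative."""
    effects = []
    for token, row in zip(vocab, unembedding_rows):
        effect = sum(d * w for d, w in zip(decoder_direction, row))
        effects.append((token, effect))
    return sorted(effects, key=lambda pair: pair[1], reverse=True)


def activation_density(activations):
    """Fraction of sampled tokens on which the feature is active."""
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Toy, hypothetical data: a 2-dimensional model with three vocabulary entries.
vocab = ["Thought", "thoughts", "er"]
unembedding_rows = [[1.0, 0.5], [0.5, 0.5], [-1.0, 0.0]]
decoder_direction = [1.0, 0.5]

ranked = logit_effects(decoder_direction, unembedding_rows, vocab)

# A feature active on 9 of 10,000 tokens has density 0.0009, i.e. 0.090%.
density = activation_density([0.0] * 9991 + [1.0] * 9)
```

Under this reading, the head of `ranked` corresponds to the positive-logit list and the tail to the negative-logit list.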