INDEX
Explanations
analysis
This neuron primarily detects mentions of commentary or analysis (words such as “analysis,” “insights,” and “thoughts”).
Negative Logits
astics         -0.07
Alright        -0.06
<dynamic       -0.06
backtrack      -0.06
fseek          -0.06
mpi            -0.06
darkness       -0.06
granularity    -0.06
['./           -0.06
elop           -0.06
Positive Logits
analysis       0.08
analyst        0.07
insightful     0.07
Analysis       0.07
essays         0.06
relating       0.06
comment        0.06
insight        0.06
over           0.06
.Wh            0.06
Activations Density 0.023%
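
To give a sense of scale for the density figure, here is a minimal sketch in Python. It assumes that "activations density" means the fraction of tokens on which the neuron's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density counts tokens with activation above zero.
    The source page does not state how the figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the neuron fires on 23 of 100,000 tokens.
acts = [0.0] * 99_977 + [1.5] * 23
density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.023%"
```

Under that assumption, a density of 0.023% corresponds to roughly 23 active tokens per 100,000, which would fit a neuron that responds to a narrow set of words.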