Explanations
the possessive pronoun "his."
Neuron Alignment
Index   Value   % of L₁
148     +0.13   0.7%
218     +0.13   0.7%
162     +0.11   0.6%
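
The page does not define the "% of L₁" column. One plausible reading is that it gives each neuron's weight as a share of the L₁ norm of the feature's full weight vector. The sketch below illustrates that reading only; the function name `l1_share` and the toy vector are hypothetical and do not come from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: this is how a "% of L1" column could be computed;
    the dashboard itself does not state its formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Toy vector, made up for illustration (not the feature's real weights).
toy_weights = [0.5, -0.25, 0.25]
print(f"{l1_share(toy_weights, 0):.1%}")
```

Under this reading, a row with value +0.13 at 0.7% would imply an L₁ norm of roughly 0.13 / 0.007 ≈ 19, up to rounding in the displayed figures.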
Correlated Neurons
Index   Pearson Corr.   Cos Sim.
96      +0.13           0.11
120     +0.13           0.09
191     +0.11           0.09
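
For reference, the two similarity measures named in the table can be computed as in the minimal sketch below. Which vectors the dashboard actually compares (for example, activations over a token sample) is not stated on the page, so the inputs here are an assumption and the toy data is made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy activation vectors (hypothetical, not taken from the dashboard).
feature_acts = [0.0, 1.0, 0.0, 2.0]
neuron_acts = [0.5, 1.5, 0.0, 1.0]
print(pearson_correlation(feature_acts, neuron_acts))
print(cosine_similarity(feature_acts, neuron_acts))
```

The two measures differ only in the centering step, which is why the table can report similar but not identical values for the same neuron.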
Negative Logits
ène     -1.67
·¸      -1.66
"_      -1.39
;">     -1.37
;_      -1.37
_"      -1.35
>"      -1.34
assa    -1.34
="#     -1.31
inst    -1.31
Positive Logits
([**        1.61
Maryland    1.50
gov         1.49
\.          1.44
pandemic    1.38
Violence    1.37
ilities     1.32
Iran        1.32
Baltimore   1.30
COVID       1.25
Activations Density 0.145%
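
Activation density is commonly taken to mean the fraction of sampled tokens on which the feature fires at all; the page does not give its definition, so the sketch below assumes that reading. The function name, the threshold parameter, and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a
    non-zero (above-threshold) activation.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample: the feature fires on 1 of 4 tokens.
sample = [0.0, 0.0, 0.0, 1.5]
print(f"{activation_density(sample):.3%}")
```

Under this definition, a density of 0.145% would correspond to the feature firing on roughly 1.45 of every 1,000 tokens.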