INDEX
Explanations
references to "car" and its variations in context
Neuron Alignment
Index    Value    % of L₁
156      +0.18    1.0%
410      +0.13    0.7%
443      +0.11    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
113      +0.18       0.03
443      +0.13       0.02
194      +0.11       0.02
Negative Logits
ĥ½        -2.49
Ĥ¬        -2.28
£         -2.09
ĥ         -1.83
CDATA     -1.77
Ļ         -1.75
ĨĴ        -1.70
ĩ         -1.67
ense      -1.61
rapeut    -1.59
Positive Logits
ousel     2.60
avan      2.35
illon     2.04
riages    1.95
ving      1.94
riers     1.93
rier      1.87
pool      1.76
ved       1.69
gos       1.68
Activations Density: 0.112%