INDEX
Explanations
phrases conveying strong emotional sentiments
Neuron Alignment
Index   Value   % of L₁
381     +0.11   0.6%
479     +0.11   0.6%
78      +0.11   0.6%
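
The "% of L₁" column is most naturally read as each neuron's share of the vector's L₁ norm: the absolute value of that entry divided by the sum of absolute values over all entries. The dashboard does not state this, so treat it as an assumption. A minimal sketch under that reading, using a hypothetical helper name and made-up weights rather than the values above:

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means abs(entry) / sum(abs(all entries)).
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Toy vector (illustrative values only, not taken from the dashboard).
toy_weights = [0.5, -0.25, 0.25]
print(l1_share(toy_weights, 0))  # 50.0
```

Under this reading, a value of 0.11 accounting for 0.6% would imply a total L₁ norm of roughly 18, though the rounding in both columns makes that only a rough estimate.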
Correlated Neurons
Index   P. Corr.   Cos Sim.
237     +0.11      0.07
170     +0.11      0.06
421     +0.11      0.05
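
"P. Corr." and "Cos Sim." presumably abbreviate Pearson correlation and cosine similarity. Which vectors the dashboard feeds into each measure is not stated here, so the sketch below only illustrates the two formulas on toy lists. The function names are illustrative, not part of any dashboard API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy data (illustrative only): strongly aligned before centering,
# perfectly anti-correlated after centering.
a = [10.0, 11.0, 12.0]
b = [12.0, 11.0, 10.0]
print(cosine_similarity(a, b), pearson_correlation(a, b))
```

The two measures can disagree sharply because Pearson correlation removes each vector's mean first, which is one reason a dashboard might report both.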
Negative Logits
Tex       -1.64
ôle       -1.61
↵³³       -1.54
ération   -1.50
Vermont   -1.42
ître      -1.41
š         -1.40
ô         -1.37
asting    -1.36
Sgt       -1.36
Positive Logits
paces     1.49
ARY       1.45
ystem     1.43
lining    1.43
arial     1.41
contrary  1.39
iae       1.39
erated    1.38
eri       1.35
Nations   1.33
Activations Density 0.107%