Explanations
references to a specific location or ranking
Neuron Alignment
Index   Value   % of L₁
1491    +0.12   0.4%
1381    +0.12   0.4%
421     +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1491    +0.12      0.03
1381    +0.12      0.03
78      +0.11      0.02
Negative Logits
pompa      -0.54
interag    -0.54
interrom   -0.54
furg       -0.52
kask       -0.51
rapor      -0.51
panik      -0.50
úde        -0.50
komik      -0.50
sedia      -0.50
Positive Logits
Placing      1.13
placed       1.10
placement    1.07
placing      1.06
placed       1.05
placements   1.04
Placement    0.96
Placed       0.94
PLACE        0.93
Place        0.93
Activations Density 0.079%