INDEX
Explanations: references to a specific person or name
Neuron Alignment
Index   Value   % of L₁
59      +0.17   1.0%
338     +0.15   0.8%
444     +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
59      +0.17      0.02
410     +0.15      0.03
338     +0.14      0.01
Negative Logits
ahu         -1.45
moving      -1.45
same        -1.44
cep         -1.41
odor        -1.40
applic      -1.40
blogger     -1.38
slightest   -1.38
PLIED       -1.36
ghosts      -1.35
Positive Logits
ī   2.45
¥   2.25
ģ   2.10
·   2.06
µ   2.06
¦   2.04
Ķ   2.02
©   1.98
Ĵ   1.96
ª   1.96
Activations Density 0.130%