INDEX
Explanations
individuals who are influential or noteworthy in specific fields or contexts
Neuron Alignment
Index   Value   % of L₁
263     +0.17   1.0%
56      +0.15   0.9%
23      +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
23      +0.17      0.13
56      +0.15      -0.01
331     +0.14      0.08
Negative Logits
itory   -1.61
iom     -1.57
OPLE    -1.56
^](#    -1.54
isons   -1.46
){#     -1.43
zione   -1.42
]:      -1.42
:`      -1.41
ICT     -1.40
Positive Logits
himself   2.34
himself   2.10
his       1.80
beard     1.63
Willi     1.62
wife      1.56
stood     1.46
sparing   1.46
joking    1.40
bald      1.35
Activations Density 1.782%