INDEX
Explanations
specific names of individuals mentioned in the text
Neuron Alignment
Index    Value    % of L₁
1741     +0.18    0.6%
1919     +0.15    0.5%
478      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1919     +0.18       0.08
1097     +0.15       0.06
1650     +0.10       0.04
Negative Logits
affez       -1.12
parati      -0.94
anticipo    -0.87
marrone     -0.86
palab       -0.84
cioc        -0.82
abito       -0.82
palio       -0.82
soggior     -0.82
tph         -0.81
Positive Logits
himself     0.75
'           0.74
’           0.68
had         0.57
URBANA      0.55
himself     0.55
was         0.54
Jego        0.51
has         0.51
Himself     0.50
Activations Density: 0.157%