INDEX
Explanations
specific references to historical events, especially related to governance and social issues
Neuron Alignment
Index   Value   % of L₁
776     +0.15   0.4%
1013    +0.14   0.4%
1967    +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
321     +0.15      0.03
776     +0.14      0.03
1008    +0.11      0.02
Negative Logits
!!</      -0.92
„,        -0.92
pixabay   -0.91
lele      -0.90
idr       -0.85
vnd       -0.84
thut      -0.82
?</       -0.80
»>        -0.79
autunno   -0.78
Positive Logits
century   0.81
th        0.64
century   0.63
entieth   0.60
Century   0.56
Jeez      0.55
decade    0.54
Thief     0.53
period    0.51
Bruh      0.51
Activations Density 0.042%