INDEX
Explanations
information about historical places or events
Neuron Alignment
Index   Value   % of L₁
872     +0.16   0.5%
1535    +0.16   0.5%
382     +0.16   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.16      0.10
1535    +0.16      0.08
1959    +0.16      0.08
Negative Logits
lapto      -0.94
solidar    -0.89
toal       -0.86
demen      -0.85
notor      -0.85
utop       -0.82
hamburg    -0.81
hoj        -0.79
hek        -0.76
ideolog    -0.76
Positive Logits
Hence          0.64
Thus           0.63
Therefore      0.62
Thereafter     0.61
The            0.61
Later          0.60
Now            0.60
Subsequently   0.60
This           0.60
Centuries      0.59
Activations Density 0.577%