INDEX
Explanations
word forms related to specific periods of time and historical events
Neuron Alignment
Index   Value   % of L₁
841     +0.10   0.3%
1614    +0.09   0.3%
1899    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1614    +0.10      0.07
841     +0.09      0.04
1899    +0.09      0.05
Negative Logits
effe      -1.09
inder     -0.99
fte       -0.97
compen    -0.97
„,        -0.97
dispen    -0.96
doman     -0.95
mef       -0.95
nece      -0.95
ofre      -0.95
Positive Logits
sixties   0.58
Soviet    0.57
polio     0.57
mistak    0.55
era       0.54
USSR      0.54
eniendo   0.52
decades   0.52
Beatles   0.52
agissait  0.52
Activations Density: 0.360%