INDEX
Explanations
military-related terms and organizational structures
Neuron Alignment
Index   Value   % of L₁
755     +0.12   0.4%
658     +0.11   0.4%
1778    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1778    +0.12      0.06
658     +0.11      0.07
755     +0.10      0.07
Negative Logits
Kleine   -0.61
<bos>    -0.57
dras     -0.56
konk     -0.55
fisk     -0.55
dek      -0.55
Neder    -0.54
katal    -0.54
kant     -0.52
kard     -0.51
Positive Logits
inappro      0.93
embodi       0.93
Darío        0.92
hornblende   0.90
ecru         0.89
Mejía        0.88
quartzite    0.87
hairc        0.87
depic        0.86
Mónica       0.86
Activations Density: 0.601%