Explanations
phrases related to personal stories and experiences in a military context
Neuron Alignment
Index   Value   % of L₁
184     +0.20   0.7%
1978    +0.14   0.4%
1013    +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.20      0.04
1533    +0.14      0.02
1129    +0.13      0.05
Negative Logits
vogli      -1.12
lidl       -1.08
vito       -1.06
michelin   -1.05
bandung    -1.04
allarg     -1.03
haup       -1.02
casio      -1.02
peppa      -1.02
kosme      -1.00
Positive Logits
[           0.61
really      0.61
everybody   0.59
kind        0.58
got         0.58
went        0.58
my          0.58
super       0.57
totally     0.56
stuff       0.56
Activations Density 0.306%