INDEX
Explanations
words related to dedication and commitment
Neuron Alignment
Index   Value   % of L₁
370     +0.11   0.4%
553     +0.11   0.4%
900     +0.10   0.3%
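As a rough illustration of how a "% of L₁" column like this could be derived, the sketch below ranks the entries of a weight vector by magnitude and reports each entry's share of the vector's L₁ norm. This assumes the column means |w_i| / Σ|w_j|; the page itself does not say so. The function name `top_l1_shares` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def top_l1_shares(weights, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumption: "% of L1" is |w_i| divided by the sum of |w_j| over all entries.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []
    # Rank neuron indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:k]]


# Toy vector for illustration only; these are not values from the page.
toy = [0.5, -0.25, 0.125, 0.125]
rows = top_l1_shares(toy, k=2)
for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, a top value of +0.11 at 0.4% of L₁ would mean the weight is spread thinly over many neurons rather than concentrated in a few.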
Correlated Neurons
Index   P. Corr.   Cos Sim.
270     +0.11      0.02
370     +0.11      0.02
900     +0.10      0.02
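The two columns measure different things, which is why they need not agree. A minimal sketch follows, assuming "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two vectors; the page does not state which quantities are compared. The helper names and the toy inputs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy inputs for illustration only; these are not values from the page.
feature_acts = [0.0, 1.0, 2.0, 3.0]
neuron_acts = [1.0, 3.0, 5.0, 7.0]   # an affine function of feature_acts
r = pearson(feature_acts, neuron_acts)
c = cosine([1.0, 0.0], [0.0, 1.0])   # orthogonal directions
print(r, c)
```

Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so a pair can score differently on the two measures.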
Negative Logits
nece        -0.54
intellig    -0.50
reger       -0.49
paus        -0.47
omnes       -0.46
effe        -0.46
laun        -0.45
revan       -0.45
Orozco      -0.45
NOWLED      -0.45
Positive Logits
dedicated     1.23
dedicate      1.21
dedicated     1.13
Dedicated     1.12
dedication    1.08
Dedicated     1.08
devoted       1.04
dedic         1.03
devote        0.96
Dedication    0.95
Activations Density 0.058%