Explanations
descriptions of features and functionalities related to digital tools or technologies
Neuron Alignment
Index   Value   % of L₁
478     +0.14   0.5%
966     +0.12   0.4%
1870    +0.11   0.4%
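The "% of L₁" column above is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute value as a share of the L₁ norm of the full alignment vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and not taken from the dashboard.

```python
def l1_share(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm.

    Assumption: this mirrors what a "% of L1" column might report; the page
    itself does not define the column.
    """
    l1 = sum(abs(w) for w in weights)  # L1 norm: sum of absolute values
    if l1 == 0:
        # Avoid division by zero for an all-zero vector
        return [0.0 for _ in weights]
    return [abs(w) / l1 for w in weights]


# Hypothetical toy vector, not the feature's real weights
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
```

Under this reading, a value of +0.14 at 0.5% would imply an L₁ norm of roughly 28 for the full vector, which is also consistent with the other two rows after rounding.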
Correlated Neurons
Index   P. Corr.   Cos Sim.
2030    +0.14      0.02
188     +0.12      0.02
966     +0.11      0.02
Negative Logits
indestru     -1.10
philanth     -0.90
intersper    -0.84
disgra       -0.81
maneu        -0.80
indoc        -0.80
encomp       -0.79
inconce      -0.79
disagre      -0.79
emphat       -0.78
Positive Logits
capabilities    1.10
abilities       1.03
capability      0.97
ABILITIES       0.87
Capabilities    0.80
capabilities    0.80
Abilities       0.80
abilities       0.77
Capabilities    0.77
ability         0.77
Activations Density: 0.114%