Explanations
phrases related to job roles and responsibilities
Neuron Alignment
Index   Value   % of L₁
1385    +0.11   0.3%
1967    +0.10   0.3%
1150    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.11      0.07
59      +0.10      0.04
1166    +0.10      0.06
Negative Logits
volunte       -0.93
Rgds          -0.88
encomp        -0.87
sovere        -0.87
unlaw         -0.85
impractica    -0.79
migli         -0.79
Біо           -0.79
philanth      -0.77
inev          -0.77
Positive Logits
.                 0.59
DoubleQuotes      0.57
écution           0.57
.                 0.56
gawas             0.55
AssertionError    0.55
NSCoder           0.55
IContainer        0.54
ariConfig         0.54
komik             0.54
Activations Density 0.403%