INDEX
Explanations
phrases related to support or suitability for specific purposes
Neuron Alignment
Index    Value    % of L₁
358      +0.13    0.7%
75       +0.12    0.6%
478      +0.10    0.6%
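The page does not say how the "% of L₁" column is computed. One plausible reading, offered here only as an assumption, is that each neuron's value is divided by the L1 norm (sum of absolute values) of the feature's direction vector. The sketch below illustrates that reading on a toy vector; the function name `neuron_alignment` and the toy data are hypothetical and are not taken from the dashboard.

```python
import numpy as np

def neuron_alignment(direction, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |value| divided by the L1 norm of the vector.
    """
    direction = np.asarray(direction, dtype=float)
    l1 = np.abs(direction).sum()
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Sort indices by descending absolute value and keep the first top_k.
    order = np.argsort(-np.abs(direction))[:top_k]
    return [(int(i), float(direction[i]), float(abs(direction[i]) / l1)) for i in order]

# Toy direction vector (illustrative only, not data from the page).
toy = np.array([0.5, -0.25, 0.25, 1.0])
rows = neuron_alignment(toy, top_k=2)
for index, value, share in rows:
    print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Under this reading, the small shares in the table (under 1% each) would mean the feature's direction is spread across many neurons rather than concentrated in a few.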
Correlated Neurons
Index    P. Corr.    Cos Sim.
505      +0.13       0.07
358      +0.12       0.08
69       +0.10       0.08
Negative Logits
grounds       -1.54
wrought       -1.53
aside         -1.51
flags         -1.49
winds         -1.45
er            -1.41
Pradesh       -1.39
hats          -1.38
indicators    -1.37
statutes      -1.37
Positive Logits
ice            1.60
thood          1.49
manufacture    1.47
use            1.40
li             1.38
Mobile         1.36
Use            1.35
deal           1.35
include        1.34
exclusively    1.33
Activations Density: 0.599%