INDEX
Explanations
phrases or words related to textual materials and content
Neuron Alignment
Index   Value   % of L₁
568     +0.15   0.6%
1870    +0.14   0.5%
101     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
568     +0.15      0.03
101     +0.14      0.02
866     +0.13      0.02
Negative Logits
indestru      -0.82
shenan        -0.81
intersper     -0.80
cushi         -0.78
encomp        -0.75
tupperware    -0.73
increa        -0.73
affor         -0.71
swarovski     -0.71
hairc         -0.71
Positive Logits
material      1.32
material      1.32
Material      1.20
Material      1.20
materials     1.19
materials     1.15
Materials     1.06
MATERIAL      1.06
MATERIAL      1.05
aterial       1.04
Activations Density: 0.047%