INDEX
Explanations
instances of storytelling or personal reflection
Neuron Alignment
Index  Value  % of L₁
946    +0.11  0.3%
1978   +0.10  0.3%
468    +0.10  0.3%
Correlated Neurons
Index  P. Corr.  Cos Sim.
977    +0.11     0.04
627    +0.10     0.04
1540   +0.10     0.03
Negative Logits
дописавши   -0.57
posób       -0.57
Juga        -0.57
ressemble   -0.55
jectures    -0.54
diskon      -0.53
Ainda       -0.52
dersfield   -0.52
confirme    -0.51
Kese        -0.50
Positive Logits
aen         0.87
swarovski   0.86
meis        0.84
hairc       0.84
waer        0.81
blos        0.78
wien        0.78
cushi       0.77
geforce     0.76
myn         0.76
Activations Density: 0.480%