INDEX
Explanations
words related to helicopters
Neuron Alignment
Index   Value   % of L₁
1677    +0.17   0.6%
990     +0.13   0.5%
1127    +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1677    +0.17      0.04
1127    +0.13      0.03
990     +0.13      0.03
Negative Logits
Todavía   -0.49
Sigue     -0.48
Puedo     -0.48
Grü       -0.48
Quien     -0.47
Straße    -0.46
cstdlib   -0.44
cassert   -0.43
Alguna    -0.43
туга      -0.43
Positive Logits
Hez       0.85
territo   0.83
HEL       0.83
HE        0.78
Heli      0.77
hés       0.77
Hel       0.77
HE        0.75
Hé        0.74
Helico    0.74
Activations Density 0.222%