INDEX
Explanations
phrases related to videos or visual recordings
Neuron Alignment
Index   Value   % of L₁
331     +0.15   0.5%
1870    +0.14   0.5%
1325    +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1870    +0.15      0.03
168     +0.14      0.04
1137    +0.13      0.04
Negative Logits
Caratter     -0.63
Autore       -0.61
Longueur     -0.61
Flashcards   -0.60
Outils       -0.60
Sì           -0.59
viciss       -0.59
Hauteur      -0.59
charité      -0.58
Consigli     -0.57
Positive Logits
piemē        0.68
mī           0.62
nepiecieš    0.60
dårlig       0.60
ījum         0.59
islation     0.58
ipment       0.58
footage      0.58
mør          0.57
izvē         0.57
Activations Density: 0.271%