INDEX
Explanations
references to movies and movie-related contexts
Neuron Alignment
Index  Value  % of L₁
284    +0.08  0.2%
2045   +0.08  0.2%
205    +0.07  0.2%
Correlated Neurons
Index  P. Corr.  Cos Sim.
284    +0.08     0.07
2045   +0.08     0.05
2044   +0.07     0.06
Negative Logits
meis   -1.22
aen    -1.19
ivi    -1.18
„,     -1.17
unan   -1.17
lele   -1.15
dises  -1.15
pessi  -1.14
secon  -1.13
Juf    -1.13
Positive Logits
才能     0.93
before   0.80
to       0.80
כדי      0.78
чтобы    0.78
để       0.74
because  0.72
来       0.71
unless   0.70
must     0.69
Activations Density: 0.498%