INDEX
Explanations
phrases related to physical actions or movements involving various characters
Neuron Alignment
Index    Value    % of L₁
897      +0.13    0.4%
1416     +0.12    0.4%
674      +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1416     +0.13       0.05
897      +0.12       0.04
1984     +0.12       0.05
Negative Logits
thut       -1.27
chong      -1.24
fte        -1.23
jati       -1.21
wherea     -1.21
bandung    -1.21
perciò     -1.19
fta        -1.19
?...       -1.19
effe       -1.17
Positive Logits
in                0.71
In                0.59
IN                0.59
into              0.58
GEBURTS           0.56
in                0.56
<<<<<<<<<<<<<<    0.55
getIn             0.55
Bekijk            0.53
In                0.53
Activations Density: 0.204%