INDEX
Explanations
mentions of different types of exits
Neuron Alignment
Index   Value   % of L₁
204     +0.14   0.5%
1416    +0.12   0.4%
501     +0.12   0.4%
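One plausible reading of the "% of L₁" column is each neuron's share of the L₁ norm of the alignment vector, that is, |value| divided by the sum of absolute values over all neurons. The page does not state this definition, so treat it as an assumption. The sketch below illustrates that reading on a toy vector; the function name `l1_share_percent` and the toy weights are hypothetical and are not taken from the dashboard.

```python
def l1_share_percent(weights):
    """Return each weight's share of the vector's L1 norm, in percent.

    Assumed definition (not confirmed by the source page):
        share_i = 100 * |w_i| / sum_j |w_j|
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Toy vector, chosen only so the arithmetic is easy to follow.
toy_weights = [0.5, -0.25, 0.25]
print(l1_share_percent(toy_weights))  # [50.0, 25.0, 25.0]
```

Under that assumption the rounded figures in the table are self-consistent: +0.14 at 0.5% implies an L₁ norm of roughly 28, and 0.12 / 28 is about 0.43%, which rounds to the listed 0.4%.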
Correlated Neurons
Index   P. Corr.   Cos Sim.
204     +0.14      0.02
1416    +0.12      0.02
124     +0.12      0.02
Negative Logits
Token     Logit
pudesse   -0.44
tonode    -0.41
setCell   -0.40
Stephen   -0.39
ESOME     -0.38
biamo     -0.38
Stephen   -0.38
julho     -0.38
pemas     -0.37
řev       -0.36
Positive Logits
Token     Logit
exit      1.25
Exit      1.24
exit      1.14
EXIT      1.13
Exit      1.11
exits     1.09
exited    1.01
EXIT      1.00
exits     0.93
exiting   0.86
Activations Density: 0.102%
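Activation density is commonly taken to mean the fraction of sampled tokens on which the feature fires, meaning its activation is above zero. Assuming that definition, which the page itself does not spell out, a minimal sketch looks like this. The function name `activation_density_percent`, the `threshold` parameter, and the toy activations are all hypothetical.

```python
def activation_density_percent(activations, threshold=0.0):
    """Percentage of token positions whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source page):
        density = 100 * (# activations > threshold) / (# activations)
    """
    if not activations:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in activations if a > threshold)
    return 100.0 * firing / len(activations)


# Toy data: the feature fires on one of four tokens.
toy_activations = [0.0, 0.0, 1.5, 0.0]
print(activation_density_percent(toy_activations))  # 25.0
```

On this reading, a density of 0.102% would mean the feature fires on roughly one token in a thousand.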