Explanations
traffic regulations and driving guidelines
Neuron Alignment
Index   Value   % of L₁
1150    +0.21   0.7%
764     +0.15   0.5%
1438    +0.14   0.5%
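The "% of L₁" column is not defined on the page. A minimal sketch of one plausible reading follows, assuming it is each neuron's absolute value divided by the L₁ norm (the sum of absolute values) of the full vector. The function name `l1_share` and the sample vector are hypothetical. Under this reading, the rows listed here would imply a total L₁ norm of roughly 30, since 0.21 is reported as 0.7%.

```python
def l1_share(values, top_k=3):
    """Rank entries by absolute value and report each one's share of the L1 norm.

    Assumption: "% of L1" means |value| / sum(|values|). The dashboard does not
    state its formula, so this is an illustration only.
    """
    l1_norm = sum(abs(v) for v in values)
    if l1_norm == 0:
        return []
    # Indices sorted by descending magnitude; ties keep their original order.
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return [(i, values[i], abs(values[i]) / l1_norm) for i in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical toy vector, not data from the dashboard.
    for index, value, share in l1_share([0.5, -0.25, 0.25], top_k=2):
        print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Because the share uses absolute values, a strongly negative entry ranks as high as a strongly positive one of the same magnitude.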
Correlated Neurons
Index   P. Corr.   Cos Sim.
1438    +0.21      0.06
284     +0.15      0.07
1235    +0.14      0.06
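The two columns above measure different things, which is why a neuron can have a noticeable Pearson correlation and a near-zero cosine similarity. A minimal sketch of both metrics follows, assuming "P. Corr." is the Pearson correlation and "Cos Sim." the cosine similarity between two equal-length vectors. Which vectors the dashboard actually compares (activations, weights, or something else) is not stated on the page, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


if __name__ == "__main__":
    # Hypothetical toy vectors, not data from the dashboard.
    a, b = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]
    print(pearson_correlation(a, b), cosine_similarity(a, b))
```

Mean-centering is the only difference between the two: the toy vectors above are perfectly anti-correlated, yet their cosine similarity is positive because both have a large shared offset.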
Negative Logits
alkoh            -1.09
solidar          -1.03
kosme            -0.96
Strukt           -0.96
konserv          -0.94
geograf          -0.93
kollek           -0.93
kooper           -0.90
radikal          -0.89
Erreferentziak   -0.89
Positive Logits
anyways      0.73
elsewhere    0.71
anyway       0.67
afterwards   0.67
later        0.65
thereafter   0.65
.            0.64
<eos>        0.64
instead      0.62
afterward    0.61
Activations Density 0.956%