Explanations
verbs related to starting or beginning
Neuron Alignment
Index    Value    % of L₁
1392     +0.12    0.4%
1937     +0.10    0.3%
1425     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1392     +0.12       0.06
1381     +0.10       0.05
120      +0.10       0.04
Negative Logits
peppa      -0.60
komende    -0.59
pooh       -0.58
shenan     -0.57
Tweede     -0.56
ineffec    -0.55
milf       -0.54
vierge     -0.53
déchir     -0.52
bahia      -0.51
Positive Logits
started    0.77
started    0.72
STARTED    0.68
start      0.66
Started    0.63
STARTED    0.63
began      0.63
START      0.61
starts     0.61
Started    0.60
Activations Density 0.108%