INDEX
Explanations
phrases related to limits and boundaries
Neuron Alignment
Index   Value   % of L₁
313     +0.12   0.4%
528     +0.11   0.4%
241     +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
313     +0.12      0.02
241     +0.11      0.02
1516    +0.10      0.02
Negative Logits
Token     Logit
stoff     -0.68
akti      -0.68
alkoh     -0.67
kram      -0.66
kado      -0.64
kras      -0.64
makro     -0.62
kog       -0.61
Kategor   -0.61
cso       -0.60
Positive Logits
Token     Logit
limit     1.18
limit     1.14
limits    1.13
Limit     1.12
Limit     1.11
Limits    1.05
LIMIT     1.01
limits    0.99
Limits    0.97
LIMIT     0.96
Activations Density 0.056%