INDEX
Explanations
phrases related to narrow concepts or limited perspectives
Neuron Alignment
Index   Value   % of L₁
1376    +0.15   0.5%
897     +0.12   0.4%
1416    +0.12   0.4%
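The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it reports each neuron's absolute alignment value as a share of the L₁ norm (the sum of absolute values) of the full alignment vector. The listed rows are consistent with that reading: 0.15 at 0.5% and 0.12 at 0.4% both imply an L₁ norm of about 30. A minimal sketch of the calculation follows, with a hypothetical helper name `l1_share` and made-up toy weights rather than the real vector:

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: this is how a "% of L1" figure could be derived;
    the dashboard itself does not state its formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Toy alignment vector (illustrative values only, not taken from the table).
toy_weights = [0.15, -0.12, 0.12, -0.05, 0.06]

# Share of the L1 norm held by each entry; the shares sum to 1.
shares = [l1_share(toy_weights, i) for i in range(len(toy_weights))]

for i, share in enumerate(shares):
    print(f"entry {i}: {share:.1%} of L1")
```

Under this reading, shares as small as 0.4% to 0.5% for the top-ranked neurons would mean the alignment is spread thinly across many neurons rather than concentrated in a few.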
Correlated Neurons
Index   P. Corr.   Cos Sim.
1376    +0.15      0.02
1218    +0.12      0.02
240     +0.12      0.02
Negative Logits
citenamefont      -0.50
HttpPut           -0.50
adaptiveStyles    -0.49
ExecuteAsync      -0.49
-------           -0.46
aleg              -0.46
buk               -0.45
TextFormField     -0.44
WindowConstants   -0.44
Viitteet          -0.43
Positive Logits
Narrow      1.30
Narrow      1.28
narrow      1.24
narrow      1.16
narrowing   1.09
narrows     1.08
narrowed    1.07
narrower    0.95
narrowly    0.90
Lmfao       0.73
Activations Density: 0.067%