INDEX
Explanations
the word "loose" in various contexts
Neuron Alignment
Index   Value   % of L₁
444     +0.14   0.8%
172     +0.14   0.8%
111     +0.13   0.7%
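The "% of L₁" figures are not defined on the page. One plausible reading, and it is only an assumption, is that each value is the share of a weight vector's L₁ norm (the sum of absolute values) carried by a single neuron. The sketch below shows that calculation on made-up numbers; `l1_share` is a hypothetical helper name, not part of any dashboard API.

```python
def l1_share(weights, index):
    """Fraction of the vector's L1 norm contributed by weights[index].

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The page does not
    confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Toy vector with made-up values (not taken from the dashboard).
weights = [0.5, -0.25, 0.25]
print(f"{l1_share(weights, 0):.1%}")
```

Under this reading, a share of 0.8% for the top neuron would mean the weight mass is spread thinly across many neurons rather than concentrated in a few.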
Correlated Neurons
Index   P. Corr.   Cos Sim.
444     +0.14      0.01
172     +0.14      0.01
476     +0.13      0.01
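"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The page does not say which vectors each statistic is computed over, so the sketch below only illustrates how the two differ when applied to the same pair of made-up vectors: Pearson correlation is cosine similarity after subtracting each vector's mean. The function names are illustrative.

```python
import math


def cosine_similarity(x, y):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Made-up vectors: perfectly anti-correlated, yet pointing in a similar
# direction because every entry is positive.
x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]
print(round(pearson_correlation(x, y), 3), round(cosine_similarity(x, y), 3))
```

Because centering can change the result this much, the two columns need not agree in magnitude, or even in sign.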
Negative Logits
Token       Logit
brighter    -1.48
ories       -1.46
brave       -1.41
lack        -1.41
slightest   -1.38
wonderful   -1.37
wonders     -1.37
ris         -1.36
ease        -1.35
brightly    -1.35
Positive Logits
Token   Logit
ģ       3.84
Ģ       3.73
Ľ       3.59
ł       3.58
¡       3.48
ĻĤ      3.40
¼       3.39
½       3.36
®       3.35
¾       3.32
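The symbols in the positive-logit list look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. Assuming the model uses a GPT-2 style byte-to-character table (the page does not state which tokenizer is in use), the sketch below rebuilds that table and maps each symbol back to its byte values. `token_to_bytes` is a hypothetical helper name.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2's byte-level BPE.

    Printable bytes map to themselves; the remaining bytes are assigned
    code points starting at 256, in increasing byte order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token):
    """Map a vocabulary string back to the raw bytes it stands for."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


# Tokens copied from the positive-logit list above.
tokens = ["ģ", "Ģ", "Ľ", "ł", "¡", "ĻĤ", "¼", "½", "®", "¾"]
for token in tokens:
    print(repr(token), token_to_bytes(token).hex())
```

Under that assumption, every token in the list decodes to bytes in the range 0x80 to 0xBF, which is the range UTF-8 reserves for continuation bytes of multi-byte characters.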
Activations Density 0.024%
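The density figure is most naturally read as the fraction of sampled tokens on which the feature has a nonzero activation, though the page does not spell out the definition. A minimal sketch of that calculation on made-up activations follows; the function name and the `threshold` parameter are assumptions.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" counts any activation above zero as active.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations over eight token positions (not dashboard data).
activations = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.2, 0.0]
print(f"{activation_density(activations):.3%}")
```

On this reading, a density of 0.024% corresponds to roughly 24 active tokens per 100,000 sampled.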