INDEX
Explanations
words related to online storage and accessibility
Neuron Alignment
Index   Value   % of L₁
284     +0.11   0.3%
1403    +0.10   0.3%
690     +0.09   0.3%
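The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading follows, assuming it means each neuron's absolute value divided by the L₁ norm of the whole feature direction. The function name `top_aligned_neurons` and the toy vector are hypothetical and are not taken from the tool that produced this page.

```python
def top_aligned_neurons(direction, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumption: "% of L1" is |value| / sum(|all values|) of the direction.
    """
    l1_norm = sum(abs(v) for v in direction)
    if l1_norm == 0:
        return []
    # Rank neuron indices by absolute value, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1_norm) for i in ranked[:k]]


# Toy vector for illustration only, not data from this page.
toy_direction = [0.5, -0.25, 0.0, 0.125, 0.125]
rows = top_aligned_neurons(toy_direction, k=2)
for index, value, share in rows:
    print(f"{index:<6}{value:+.2f}   {share:.1%}")
```

Under this reading, a top share of only 0.3% would suggest the feature is spread over many neurons rather than aligned with a single one.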
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.11      0.06
1446    +0.10      0.03
766     +0.09      0.04
Negative Logits
disagre      -1.78
thut         -1.76
encomp       -1.74
increa       -1.71
intersper    -1.70
inev         -1.68
reluct       -1.65
ftu          -1.63
depic        -1.62
unden        -1.56
Positive Logits
anywhere     0.69
mobile       0.68
browser      0.66
ktop         0.61
wherever     0.59
aarrggbb     0.58
ynb          0.57
smartphone   0.56
tablet       0.56
الرياضيه     0.56
Activations Density: 0.333%
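The density figure is reported without a definition. As a rough sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero, it could be computed as below. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is > 0.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy activations for illustration only, not data from this page.
toy_activations = [0.0, 0.0, 1.2, 0.0]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On that reading, 0.333% would correspond to the feature firing on roughly one token in 300.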