Explanations
text related to work, particularly in an occupational context
Neuron Alignment
Index   Value   % of L₁
47      +0.10   0.3%
198     +0.10   0.3%
581     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
392     +0.10      0.02
47      +0.10      0.03
1229    +0.07      0.02
Negative Logits
kompakt     -0.59
hek         -0.53
elek        -0.52
архивлан    -0.52
geograf     -0.51
logis       -0.51
intrig      -0.49
Obt         -0.49
inverte     -0.48
Filt        -0.48
Positive Logits
<bos>       1.01
WHETHER     0.76
whether     0.71
whether     0.69
Whether     0.69
Whether     0.64
either      0.62
bonté       0.59
Fuckin      0.58
mondeo      0.58
Activations Density: 0.124%