Explanations
instructions or invitations to join various communities or activities
Neuron Alignment
Index   Value   % of L₁
50      +0.18   1.1%
1233    +0.14   0.8%
597     +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1233    +0.18      0.04
597     +0.14      0.03
1387    +0.14      0.03
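The two columns above are labeled "P. Corr." and "Cos Sim.", read here as Pearson correlation and cosine similarity. The dashboard does not say which vectors each column is computed over, so the following is only a minimal sketch of the two standard definitions on made-up inputs. The function names `pearson` and `cosine` and the sample vectors are assumptions for illustration, not taken from the source.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Both vectors are mean-centered before comparison.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def cosine(x, y):
    """Cosine similarity: dot product over the product of Euclidean norms (no mean-centering)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Hypothetical vectors: y is x shifted by a constant offset.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]
r = pearson(x, y)   # the offset is removed by centering
c = cosine(x, y)    # the offset changes the angle between the vectors
```

The only difference between the two measures is the mean-centering step, so they can disagree noticeably on the same pair of vectors. That is one possible reason the two columns in the table show different magnitudes.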
Negative Logits
Token             Logit
<bos>             -3.03
rungsseite        -0.86
public            -0.80
lateinit          -0.79
char              -0.78
(no token text)   -0.78
(no token text)   -0.77
ValueGenerated    -0.77
protected         -0.76
(no token text)   -0.74
Positive Logits
Token       Logit
increa      2.26
unlaw       2.18
impra       2.16
affor       2.16
accla       2.06
Juf         2.06
stockholm   2.06
disagre     2.05
guarante    2.05
reluct      2.05
Activations Density: 0.117%