Explanations
references to a particular time or repetition of actions
Neuron Alignment
Index    Value    % of L₁
50       +0.19    1.1%
1671     +0.16    0.9%
1406     +0.13    0.8%
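The "% of L₁" column is not defined on this page. One plausible reading, treated here purely as an assumption, is that each value is one component of the feature's weight vector and the percentage is that component's share of the vector's L₁ norm (the sum of absolute values). Under that reading, the three rows above would imply an L₁ norm of roughly 17, since 0.19 / 0.011 ≈ 17.3. A minimal sketch, with the hypothetical helper `neuron_alignment` and made-up weights:

```python
def neuron_alignment(weights, k=3):
    """Return the k largest-magnitude components as (index, value, share of L1).

    Assumption: "% of L1" means abs(value) / sum(abs(all values)).
    Ranking by absolute value is also an assumption.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("weight vector has zero L1 norm")
    # Indices ordered by descending magnitude; ties keep their original order.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)[:k]
    return [(i, weights[i], abs(weights[i]) / l1) for i in order]


# Illustrative weights only, not taken from the page.
for index, value, share in neuron_alignment([0.5, -0.25, 0.125, 0.125], k=2):
    print(f"{index:<6}{value:+.2f}   {share:.1%}")
```

The share is computed from the magnitude, so a negative component still contributes a positive percentage.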
Correlated Neurons
Index    P. Corr.    Cos Sim.
1406     +0.19       0.05
1671     +0.16       0.05
437      +0.13       0.05
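The table reports two different similarity measures per neuron. Assuming "P. Corr." stands for Pearson correlation and "Cos Sim." for cosine similarity (the page does not spell out either, nor which vectors are compared), the sketch below shows why the two columns can disagree: Pearson correlation centers both series on their means first, while cosine similarity compares the raw vectors. The function names and sample data are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equal-length sequences (no mean-centering)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Made-up series: perfectly anti-correlated, yet pointing in a similar direction.
xs, ys = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]
print(f"pearson={pearson(xs, ys):+.2f}  cosine={cosine(xs, ys):.2f}")
```

In this toy case the Pearson correlation is strongly negative while the cosine similarity is positive, so a row with a modest correlation and a near-zero cosine similarity is not contradictory.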
Negative Logits
<bos>        -3.01
/***         -0.76
ⓧ            -0.69
(blank)      -0.68
///**        -0.67
<?           -0.67
/*!          -0.64
дописавши    -0.61
<?           -0.61
export       -0.61
Positive Logits
jaya         1.43
kram         1.41
wien         1.39
stockholm    1.35
lele         1.33
haup         1.28
unce         1.27
seksi        1.26
frankfurt    1.26
kasa         1.26
Activations Density 0.089%
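Activation density is commonly read as the fraction of sampled tokens on which the feature fires at all. Assuming that definition (the page does not state it), 0.089% would correspond to about 89 active tokens per 100,000. A minimal sketch with a hypothetical helper and made-up activations:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: "active" means strictly greater than `threshold` (0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Illustrative data only: one active token out of four.
print(f"{activation_density([0.0, 0.0, 1.5, 0.0]):.3%}")
```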