INDEX
Explanations
prepositions followed by nouns indicating a connection or an action
Neuron Alignment
Index   Value   % of L₁
50      +0.25   1.5%
687     +0.13   0.8%
1984    +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
687     +0.25      0.16
1624    +0.13      0.12
1984    +0.11      0.13
Negative Logits
<bos>          -3.66
ⓧ              -0.73
<?             -0.70
/**            -0.64
///**          -0.64
(blank)        -0.63
SystemColors   -0.62
IsRequired     -0.62
var            -0.61
public         -0.61
Positive Logits
wien        1.70
stockholm   1.67
maneu       1.61
riviera     1.61
affor       1.60
eiffel      1.59
lidl        1.57
increa      1.53
Juf         1.53
squa        1.52
Activations Density 0.940%