INDEX
Explanations
proper nouns, particularly names of people and organizations
Neuron Alignment
Index  Value  % of L₁
50     +0.25  0.9%
789    +0.08  0.3%
227    +0.08  0.3%
Correlated Neurons
Index  P. Corr.  Cos Sim.
198    +0.25     0.04
140    +0.08     0.04
4      +0.08     0.04
Negative Logits
<bos>        -3.01
abetes       -0.87
="#"><       -0.83
HasIndex     -0.83
himo         -0.80
</table>     -0.80
             -0.80
ोंने           -0.78
mergeFrom    -0.78
новништво    -0.78
Positive Logits
affor        2.71
maneu        2.64
increa       2.60
impra        2.48
reluct       2.45
milf         2.40
inev         2.40
accla        2.39
scrat        2.39
disagre      2.39
Activations Density: 0.361%