Explanations
phrases surrounded by quotation marks
Neuron Alignment
Index    Value    % of L₁
50       +0.21    1.2%
2019     +0.12    0.7%
545      +0.12    0.7%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
545      +0.21            0.12
1404     +0.12            0.10
82       +0.12            0.09
Negative Logits
Token        Logit
<bos>        -3.69
ⓧ            -1.13
             -1.01
<?           -0.99
/**          -0.93
дописавши    -0.80
/*           -0.77
/***         -0.72
springfox    -0.70
ohist        -0.69
Positive Logits
Token         Logit
unlaw         1.21
impractica    1.20
Jambi         1.12
sovere        1.12
santiago      1.11
valencia      1.10
dison         1.10
practition    1.09
Portugu       1.08
Juf           1.08
Activations Density: 0.311%