INDEX
Explanations
phrases related to quotations and statements
Neuron Alignment
Index   Value   % of L₁
1008    +0.08   0.2%
284     +0.08   0.2%
792     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.08      0.06
62      +0.08      0.04
1925    +0.07      0.04
Negative Logits
unlaw       -1.28
encomp      -1.21
guarante    -1.21
reluct      -1.20
ftu         -1.18
accla       -1.18
volunte     -1.18
fte         -1.17
maneu       -1.17
sovere      -1.17
Positive Logits
ioutil         0.63
<bos>          0.62
regarding      0.61
audiovisuel    0.61
aughey         0.60
ariConfig      0.59
Nazionale      0.58
鍮             0.57
about          0.55
ollection      0.54
Activations Density: 0.292%