Explanations
mentions of characters' names in a dialogue or interview setup
Neuron Alignment
Index    Value    % of L₁
50       +0.19    0.7%
1842     +0.15    0.6%
2015     +0.10    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
836      +0.19       0.05
823      +0.15       0.06
510      +0.10       0.07
Negative Logits
<bos>        -2.59
(missing)    -0.70
uzyskać      -0.68
/***         -0.68
ⓧ            -0.67
<?           -0.66
<!--         -0.62
/*           -0.61
<?           -0.59
springfox    -0.59
Positive Logits
eiffel       1.10
cartier      1.03
stockholm    0.99
indestru     0.99
umbre        0.95
madonna      0.95
affor        0.93
effe         0.92
ecru         0.91
imposs       0.91
Activations Density 0.736%