Explanations
phrases related to physical activities and achievements
Neuron Alignment
Index   Value   % of L₁
50      +0.22   1.5%
478     +0.16   1.1%
1967    +0.14   1.0%
Correlated Neurons
Index   P. Corr.   Cos Sim.
478     +0.22      0.16
1967    +0.16      0.12
1108    +0.14      0.13
Negative Logits
Token       Logit
<bos>       -4.54
ⓧ           -1.34
<?          -1.22
(blank)     -1.17
/***        -1.12
/**         -1.09
/*!         -0.97
intersper   -0.93
disbur      -0.88
springfox   -0.87
Positive Logits
Token      Logit
corrom     0.86
seksi      0.86
tristes    0.78
marea      0.76
maroc      0.76
saar       0.75
ceramica   0.74
vasi       0.74
uhr        0.74
kafe       0.74
Activations Density: 1.335%