Explanations
words related to physical injuries
Neuron Alignment
Index    Value    % of L₁
906      +0.13    0.4%
1499     +0.09    0.3%
736      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
847      +0.13       0.03
736      +0.09       0.05
1700     +0.08       0.03
Negative Logits
embodi       -1.22
oleo         -1.16
unce         -1.16
stefan       -1.15
pollut       -1.14
swarovski    -1.13
unden        -1.11
roberto      -1.10
haup         -1.10
erec         -1.09
Positive Logits
<bos>        1.16
injury       0.62
recovery     0.59
recover      0.59
Recuper      0.57
rehab        0.56
Autoritní    0.56
Prensa       0.55
condition    0.54
injuries     0.54
Activations Density: 0.372%