INDEX
Explanations
mentions of numbers, statistics, and analyses in news articles
Neuron Alignment
Index   Value   % of L₁
752     +0.19   0.7%
50      +0.13   0.4%
555     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
752     +0.19      0.06
16      +0.13      0.07
1334    +0.12      0.05
Negative Logits
URBANA        -0.81
ⓧ             -0.65
mistak        -0.61
參考文獻      -0.60
relenting     -0.59
/**           -0.56
pastebin      -0.56
تانيه         -0.56
vectorielle   -0.53
<bos>         -0.53
Positive Logits
renda       0.57
sorts       0.56
Milán       0.52
Jérusalem   0.51
inverno     0.49
OF          0.49
palab       0.49
ftra        0.48
senza       0.48
Of          0.46
Activations Density 0.302%