INDEX
Explanations
words related to famous individuals, particularly actors and entertainers
Neuron Alignment
Index   Value   % of L₁
227     +0.13   0.4%
16      +0.11   0.3%
1013    +0.10   0.3%
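The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's share of the L₁ norm of the feature's weight vector, i.e. |w_i| / Σ_j |w_j|. A minimal Python sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means a single entry's share of the sum of
    absolute values of all entries in the same vector.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not the feature's actual weights.
toy_weights = [0.5, -0.25, 0.25]
print(f"{l1_share(toy_weights, 0):.1%}")  # 50.0%
```

Under this reading, small percentages such as those in the table would indicate that the feature's weight is spread across many neurons rather than concentrated in a few.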
Correlated Neurons
Index   P. Corr.   Cos Sim.
678     +0.13      0.05
16      +0.11      0.06
227     +0.10      0.06
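The headers "P. Corr." and "Cos Sim." most likely abbreviate Pearson correlation and cosine similarity. The page does not say which pairs of vectors are compared, so the sketch below only shows the two standard formulas on made-up inputs. The function names are illustrative.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation: covariance divided by the product of std devs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(u, v):
    """Cosine similarity: dot product divided by the product of L2 norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up inputs for illustration only.
print(pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

The two measures differ in that Pearson correlation subtracts the mean of each input first, while cosine similarity compares the raw directions, so the same pair of vectors can score differently on each.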
Negative Logits
商品説明      -0.63
BeforeAll     -0.62
RUnlock       -0.60
himo          -0.60
""],          -0.59
sizeCache     -0.58
IUrlHelper    -0.58
++)           -0.57
CopyWith      -0.56
الحره         -0.56
Positive Logits
Sén      1.08
Cfr      1.06
UwU      1.04
vagu     1.02
pié      1.01
Noice    1.01
Perci    1.00
Sì       1.00
sappi    0.99
Lma      0.98
Activations Density 0.338%