INDEX
Explanations
words related to punditry and expert analysis
Neuron Alignment
Index   Value   % of L₁
1194    +0.18   0.7%
479     +0.14   0.5%
1350    +0.13   0.5%
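
The page does not define the "% of L₁" column. One plausible reading is that it gives each entry's absolute value as a share of the L₁ norm (sum of absolute values) of the full alignment vector. The sketch below illustrates that reading only; the function name `percent_of_l1` and the toy vector are hypothetical and are not taken from the page.

```python
def percent_of_l1(values):
    """Return each entry's absolute value as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means |v_i| / sum_j |v_j| * 100. The source page
    does not confirm this definition.
    """
    l1 = sum(abs(v) for v in values)
    if l1 == 0:
        raise ValueError("L1 norm is zero; percentages are undefined")
    return [100.0 * abs(v) / l1 for v in values]


# Hypothetical toy vector, not the feature's real weights.
shares = percent_of_l1([0.5, -0.25, 0.25])
```

Under this reading, a value of +0.18 at 0.7% would imply an L₁ norm of roughly 26 for the whole vector, which is at least consistent with the other two rows.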
Correlated Neurons
Index   P. Corr.   Cos Sim.
1194    +0.18      0.06
1778    +0.14      0.05
75      +0.13      0.04
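
The two columns above are presumably a Pearson correlation ("P. Corr.") and a cosine similarity ("Cos Sim."); the page does not say which vectors or activation series they are computed over. As a minimal sketch of the two measures themselves, assuming plain lists of floats as input (the function names are hypothetical):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine similarity of mean-centered vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])
```

The two measures differ only in the mean-centering step, which is why they can disagree noticeably when the inputs have a large common offset.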
Negative Logits
najbol     -0.59
maksi      -0.59
maraming   -0.57
meni       -0.55
ilang      -0.53
akku       -0.53
cheiben    -0.53
itong      -0.53
šte        -0.52
siyang     -0.52
Positive Logits
P       0.67
p       0.61
P       0.61
getP    0.59
getP    0.57
p       0.53
pu      0.52
ykjav   0.49
Pim     0.48
pozi    0.48
Activations Density 0.581%
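
Activation density is commonly read as the percentage of sampled tokens on which the feature fires at all. The page does not state its exact definition or threshold, so the following is a sketch under that assumption; `activation_density` and its `threshold` parameter are hypothetical names.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (here: above-threshold) activation; the source does not confirm this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy sample: one active token out of four.
density = activation_density([0.0, 0.0, 1.2, 0.0])
```

On this reading, a density of 0.581% would correspond to the feature firing on roughly 6 tokens in every 1,000 sampled.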