Explanations
references to a specific individual named "Robinson."
Neuron Alignment
Index   Value   % of L₁
68      +0.12   0.7%
256     +0.11   0.6%
379     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
379     +0.12      0.01
68      +0.11      0.01
66      +0.11      0.01
Negative Logits
Token       Logit
illed       -1.61
urred       -1.52
thee        -1.50
ASE         -1.46
slightest   -1.40
uv          -1.39
moderate    -1.38
]>          -1.37
quired      -1.37
Č           -1.36
Positive Logits
Token       Logit
stown       1.92
vill        1.84
enstein     1.73
ship        1.69
plete       1.61
force       1.59
defense     1.59
ian         1.59
pora        1.58
ist         1.58
Activations Density: 0.014%