INDEX
Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
184     +0.17   0.9%
122     +0.14   0.8%
363     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
35      +0.17      0.11
371     +0.14      0.12
18      +0.13      0.08
Negative Logits
onder      -1.56
enth       -1.42
dimethyl   -1.42
aston      -1.39
going      -1.39
"}](#      -1.39
vier       -1.39
'$.        -1.39
severe     -1.38
ouss       -1.37
Positive Logits
predecessor    1.60
pals           1.48
favourite      1.47
@              1.47
front          1.46
nem            1.45
isans          1.45
critics        1.45
predecessors   1.43
caption        1.42
Activations Density: 0.214%
No Known Activations
This feature has no known activations.