Explanations
references to natural products or substances
Neuron Alignment
Index   Value   % of L₁
438     +0.14   0.8%
208     +0.13   0.7%
156     +0.11   0.6%
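The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole weight vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy weights are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm.

    Assumption: this is one plausible meaning of a "% of L1" column;
    the source page does not confirm it.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1_norm for w in weights]


# Hypothetical toy weights, chosen only to make the arithmetic easy to follow.
toy_weights = [0.5, -0.25, 0.25]
shares = l1_share(toy_weights)

# Print one row per entry: position, signed weight, share of the L1 norm.
for index, (weight, share) in enumerate(zip(toy_weights, shares)):
    print(f"{index}\t{weight:+.2f}\t{share:.1%}")
```

Under this reading the shares always sum to 1, so a top entry below 1% would indicate that the weight is spread thinly across many neurons rather than concentrated in a few.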
Correlated Neurons
Index   Pearson Corr.   Cosine Sim.
208     +0.14           0.03
438     +0.13           0.03
392     +0.11           0.03
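The correlation and cosine-similarity columns of the Correlated Neurons table can differ for the same neuron because the two measures are defined differently: Pearson correlation is the cosine similarity of the two vectors after each has had its mean subtracted. The page does not say which vectors each column is computed over, so the sketch below only illustrates the two definitions on hypothetical toy data; the helper names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine similarity of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy vectors, not values from the dashboard.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]

cos_uv = cosine_similarity(u, v)
corr_uv = pearson_correlation(u, v)
print(f"cosine similarity:   {cos_uv:+.3f}")
print(f"pearson correlation: {corr_uv:+.3f}")
```

In this toy case the two vectors are perfectly anti-correlated yet have a positive cosine similarity, because both have a large positive mean that the cosine measure does not remove.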
Negative Logits
Token   Logit
Ĭ       -1.84
Ľ       -1.79
Į       -1.67
her     -1.60
artan   -1.54
ľ       -1.52
ucht    -1.48
oof     -1.48
equal   -1.47
ott     -1.47
Positive Logits
Token         Logit
istic         2.20
istically     2.08
izations      1.94
isations      1.93
sciences      1.92
ness          1.80
isation       1.72
itat          1.72
denominator   1.69
izing         1.69
Activations Density: 0.079%