Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.26   0.8%
1108    +0.04   0.1%
1860    +0.04   0.1%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2       -0.26      0.00
0       -0.04      0.00
1       -0.04      0.00
Negative Logits
reluct     -8.82
shenan     -8.59
impra      -8.52
increa     -8.44
depic      -8.25
disagre    -8.25
encomp     -8.24
unspeak    -8.08
maneu      -7.95
affor      -7.94
Positive Logits
<bos>              9.41
Walkover           3.37
Paglinawan         3.04
Himo               3.00
himo               2.81
-------------</    2.71
***!               2.71
Autoritní          2.67
Shetterly          2.66
Baillargeon        2.66
Activations Density 0.000%
No Known Activations
This feature has no known activations.