INDEX
Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.22   0.7%
752     +0.06   0.2%
392     +0.05   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2       -0.22      0.00
0       -0.06      0.00
1       -0.05      0.00
Negative Logits
reluct     -8.80
increa     -8.51
impra      -8.48
shenan     -8.37
depic      -8.23
disagre    -8.21
encomp     -8.19
affor      -8.02
unspeak    -7.95
maneu      -7.93
Positive Logits
<bos>             8.05
Walkover          3.12
Paglinawan        2.67
Himo              2.54
***!              2.52
Baillargeon       2.46
himo              2.45
'\\;'             2.43
insuffisamment    2.38
Réponses          2.34
Activations Density 0.000%
No Known Activations
This feature has no known activations.