INDEX
Explanations
No Explanations Found
Neuron Alignment
Index    Value    % of L₁
674      +0.28    1.0%
1648     +0.04    0.2%
200      +0.04    0.1%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2        -0.28       0.00
0        -0.04       0.00
1        -0.04       0.00
Negative Logits
reluct      -4.43
shenan      -4.34
impra       -4.32
unspeak     -4.31
increa      -4.23
disagre     -4.14
depic       -4.05
maneu       -4.00
apprehen    -3.92
indestru    -3.92
Positive Logits
<bos>           9.37
Paglinawan      2.10
Walkover        2.08
betweenstory    2.07
expandindo      2.02
'\\;'           2.01
GEBURTSDATUM    2.00
Autoritní       1.94
Himo            1.92
脚注の使い方    1.91
Activations Density 0.000%
No Known Activations
This feature has no known activations.