INDEX
Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.33   1.2%
617     +0.07   0.3%
394     +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2       -0.33      0.00
0       -0.07      0.00
1       -0.06      0.00
Negative Logits
unspeak     -4.37
shenan      -4.24
reluct      -4.24
impra       -4.08
disagre     -4.01
increa      -3.85
indestru    -3.81
apprehen    -3.80
depic       -3.78
uninten     -3.71
Positive Logits
<bos>           10.23
Paglinawan      2.09
GEBURTSDATUM    2.09
Himo            2.00
expandindo      1.98
betweenstory    1.94
Italijani       1.94
Walkover        1.93
Autoritní       1.80
rungsseite      1.80
Activations Density 0.000%
No Known Activations
This feature has no known activations.