INDEX
Explanations
No Explanations Found
Neuron Alignment
Index    Value    % of L₁
674      +0.38    2.3%
1741     +0.07    0.4%
1870     +0.04    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2        -0.38       0.00
0        -0.07       0.00
1        -0.04       0.00
Negative Logits
unspeak       -3.26
reluct        -3.12
shenan        -3.00
unlaw         -2.98
disgra        -2.96
disagre       -2.93
impra         -2.89
impractica    -2.89
horrend       -2.82
apprehen      -2.79
Positive Logits
<bos>              13.46
GEBURTSDATUM       2.59
expandindo         2.41
betweenstory       2.38
Autoritní          2.29
تقاوى              2.13
kasarigan          2.12
Paglinawan         2.09
Administrativna    2.09
Italijani          2.08
Activations Density: 0.000%
No Known Activations
This feature has no known activations.