INDEX
Explanations: No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.39   2.2%
1741    +0.05   0.3%
1535    +0.04   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2       -0.39      0.00
0       -0.05      0.00
1       -0.04      0.00
Negative Logits
unspeak     -2.85
reluct      -2.71
disgra      -2.70
unlaw       -2.54
disagre     -2.54
impra       -2.51
shenan      -2.48
ineffec     -2.47
horrend     -2.44
apprehen    -2.42
Positive Logits
<bos>              13.31
GEBURTSDATUM       2.39
expandindo         2.38
Autoritní          2.20
betweenstory       2.17
Administrativna    2.09
تقاوى              2.04
autorytatywna      2.01
Italijani          2.01
kasarigan          1.98
Activations Density: 0.000%
This feature has no known activations.