INDEX
Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.22   0.7%
207     +0.05   0.2%
621     +0.05   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2       -0.22      0.00
0       -0.05      0.00
1       -0.05      0.00
Negative Logits
unlaw         -1.03
belliger      -0.94
panik         -0.93
disgra        -0.92
unspeak       -0.91
impractica    -0.90
unwarran      -0.89
EEU           -0.88
Yugos         -0.87
odious        -0.87
Positive Logits
<bos>            8.00
GEBURTSDATUM     1.27
expandindo       1.25
kasarigan        1.25
تقاوى            1.18
betweenstory     1.07
Autoritní        1.06
Personendaten    1.05
rungsseite       1.04
Мексичка         1.03
Activations Density: 0.000%
No Known Activations
This feature has no known activations.