Explanations
No Explanations Found
Neuron Alignment
Index    Value    % of L₁
674      +0.28    1.2%
1535     +0.06    0.2%
906      +0.05    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2        -0.28       0.00
0        -0.06       0.00
1        -0.05       0.00
Negative Logits
unspeak        -1.80
belliger       -1.59
ruinous        -1.59
miscon         -1.53
exasper        -1.51
ineffectual    -1.46
odious         -1.46
demoral        -1.44
horrend        -1.42
laug           -1.42
Positive Logits
<bos>           10.75
GEBURTSDATUM    2.33
expandindo      2.26
betweenstory    2.25
kasarigan       2.12
Autoritní       2.06
Italijani       1.96
'\\;'           1.89
Paglinawan      1.87
Geplaatst       1.86
Activations Density 0.000%
No Known Activations
This feature has no known activations.