INDEX
Explanations
No Explanations Found
Neuron Alignment
Index    Value    % of L₁
674      +0.24    0.8%
1802     +0.06    0.2%
1535     +0.06    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2        -0.24       0.00
0        -0.06       0.00
1        -0.06       0.00
Negative Logits
despotism      -0.82
odious         -0.79
ineffectual    -0.78
ruinous        -0.77
disgraceful    -0.76
impotent       -0.76
pylab          -0.76
traitors       -0.75
massacres      -0.73
infuriating    -0.73
Positive Logits
<bos>              7.55
expandindo         1.31
Administrativna    1.18
ordina             1.16
fte                1.14
betweenstory       1.13
blos               1.12
fta                1.11
oun                1.10
GEBURTSDATUM       1.10
Activations Density: 0.000%
No Known Activations
This feature has no known activations.