INDEX
Explanations
phrases related to power or empowerment
Neuron Alignment
Index   Value   % of L₁
50      +0.17   0.9%
341     +0.13   0.7%
478     +0.12   0.6%
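The page does not say how the "% of L₁" column is computed. A minimal sketch, assuming it is the absolute value of one entry divided by the L₁ norm (sum of absolute values) of the whole vector; the function name `l1_share` and the toy vector are hypothetical and are not data from this page:

```python
def l1_share(values, index):
    """Return |values[index]| as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means |v_i| / sum_j |v_j|. The page itself
    does not confirm this definition.
    """
    l1_norm = sum(abs(v) for v in values)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(values[index]) / l1_norm


# Toy vector for illustration only (not taken from the dashboard).
toy = [0.5, -0.25, 0.25]
print(f"{l1_share(toy, 0):.1f}%")  # prints "50.0%"
```

Under this reading, a small percentage for the top-ranked neuron would indicate that the feature's weight is spread across many neurons rather than concentrated in one.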
Correlated Neurons
Index   P. Corr.   Cos Sim.
341     +0.17      0.04
1053    +0.13      0.03
1507    +0.12      0.03
Negative Logits
<bos>    -2.93
/***     -0.83
         -0.75
/*!      -0.74
<?       -0.69
//---    -0.63
Vegeu    -0.62
/*       -0.61
#![      -0.60
//~      -0.59
Positive Logits
bandung      1.31
napoli       1.25
chèvre       1.24
jaya         1.23
frambo       1.20
swarovski    1.18
broderie     1.17
frankfurt    1.17
ecru         1.17
milano       1.16
Activations Density 0.118%