INDEX
Explanations
phrases related to bureaucratic processes or paperwork
Neuron Alignment
Index   Value   % of L₁
1150    +0.16   0.5%
946     +0.11   0.3%
1553    +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1553    +0.16      0.04
1150    +0.11      0.02
946     +0.11      0.03
Negative Logits
shadowRadius       -0.63
};*/               -0.62
mourut             -0.60
Drapeau            -0.59
})*/               -0.58
!*\                -0.57
setBold            -0.54
ImageBackground    -0.54
allAfrica          -0.54
aarrggbb           -0.53
Positive Logits
suscep       0.90
apprehen     0.89
unspeak      0.85
gaily        0.82
excru        0.81
disagre      0.79
horrend      0.78
indescri     0.78
disreg       0.78
scrat        0.76
Activations Density: 0.258%