Explanations
certain categories or classifications of content, such as editorial columns, commencement ceremonies, and op-eds
Neuron Alignment
Index   Value   % of L₁
50      +0.18   0.7%
1150    +0.10   0.4%
783     +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1150    +0.18      0.04
1003    +0.10      0.06
647     +0.10      0.05
Negative Logits
<bos>           -2.08
ⓧ               -1.02
/**             -0.93
                -0.89
<?              -0.84
/***            -0.79
///**           -0.75
/*              -0.74
Transcripción   -0.61
/*!             -0.60
Positive Logits
Minang      0.86
thuy        0.86
Palembang   0.78
yong        0.75
Somal       0.72
maroc       0.72
montagna    0.72
Tanjung     0.71
Juventud    0.71
Czechos     0.71
Activations Density: 0.803%