INDEX
Explanations
phrases related to critical analysis and refutation in a political context
Neuron Alignment
Index    Value    % of L₁
604      +0.12    0.3%
940      +0.09    0.2%
752      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
100      +0.12       0.04
16       +0.09       0.06
584      +0.08       0.03
Negative Logits
<bos>              -1.07
OnEvent            -0.61
openModal          -0.60
inev               -0.56
disagre            -0.55
cuck               -0.55
closeConnection    -0.55
Yess               -0.55
UwU                -0.54
Hahah              -0.54
Positive Logits
CiNii      0.87
truk       0.72
akku       0.69
puto       0.67
fasi       0.67
mahd       0.66
estekak    0.65
lapto      0.64
maksi      0.63
Église     0.63
Activations Density: 0.425%