Explanations
mentions of specific locations or organizations in news articles
Neuron Alignment
Index    Value    % of L₁
50       +0.31    1.8%
1741     +0.13    0.8%
2019     +0.13    0.7%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
382      +0.31            0.07
76       +0.13            0.06
1265     +0.13            0.05
Negative Logits
<bos>      -3.47
ⓧ          -1.21
<?         -1.02
(blank)    -1.01
/**        -0.97
/***       -0.94
/*         -0.82
<?         -0.76
disbur     -0.73
USTAIN     -0.68
Positive Logits
Presenta      0.78
véhic         0.75
seksi         0.75
Juf           0.70
Cerca         0.69
expériment    0.68
miniatura     0.68
Contribu      0.67
catég         0.67
pleins        0.67
Activations Density: 0.431%