Explanations
mentions of specific locations or events in news articles
Neuron Alignment
Index   Value   % of L₁
1741    +0.23   0.8%
1967    +0.16   0.5%
50      +0.15   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1980    +0.23      0.04
1362    +0.16      0.04
1505    +0.15      0.04
Negative Logits
hyperplasia   -0.57
szól          -0.57
tiks          -0.56
veiks         -0.56
bagay         -0.55
elashes       -0.54
tanong        -0.54
=="           -0.53
papild        -0.53
hvid          -0.52
Positive Logits
Román     0.91
Antal     0.91
Áng       0.90
solidar   0.89
Juf       0.87
Nö        0.86
parteci   0.86
Apel      0.85
sappi     0.84
Gorb      0.83
Activations Density 0.130%