INDEX
Explanations
locations or cities mentioned in news articles
Neuron Alignment
Index    Value    % of L₁
1343     +0.12    0.3%
1741     +0.11    0.3%
110      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
648      +0.12       0.04
110      +0.11       0.04
1852     +0.10       0.03
Negative Logits
nutella       -1.73
ecru          -1.71
hairc         -1.69
tupperware    -1.67
cushi         -1.66
increa        -1.64
swarovski     -1.63
disagre       -1.58
withal        -1.57
impra         -1.56
Positive Logits
höl           0.76
ortop         0.70
kosme         0.68
defekt        0.68
dras          0.67
minimalis     0.66
biograf       0.66
Fö            0.66
kritis        0.66
solidar       0.65
Activations Density: 0.120%