INDEX
Explanations
locations, such as countries and cities
Neuron Alignment
Index   Value   % of L₁
50      +0.21   0.8%
990     +0.07   0.3%
1983    +0.07   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1252    +0.21      0.10
227     +0.07      0.10
1479    +0.07      0.08
Negative Logits
<bos>                 -2.57
ⓧ                     -0.87
serve                 -0.69
(no visible token)    -0.66
/***                  -0.63
lateinit              -0.61
/**                   -0.61
ždý                   -0.60
Kontrola              -0.59
apply                 -0.59
Positive Logits
Juf          1.69
bordeaux     1.63
maneu        1.51
Cæ           1.50
carrefour    1.48
marseille    1.48
emphat       1.47
Præ          1.47
wien         1.47
eiffel       1.44
Activations Density: 1.253%