INDEX
Explanations
locations and historical events delineated by dates
Neuron Alignment
Index    Value    % of L₁
919      +0.09    0.2%
1847     +0.09    0.2%
86       +0.08    0.2%
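The "% of L₁" column is not defined on this page. One plausible reading, offered here only as an assumption, is that it reports how much of the L₁ norm of the feature's weight vector over neurons falls on a single neuron. A minimal sketch of that arithmetic follows; the function name `l1_share` and the toy vector are hypothetical and are not data from this feature.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values of all entries. The page does not confirm this.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("the L1 norm is zero, so no share is defined")
    return abs(weights[index]) / l1


# Hypothetical toy vector for illustration only.
toy = [0.5, -0.25, 0.25]
print(f"{l1_share(toy, 0):.1%}")  # prints 50.0%
```

Under this reading, a value of +0.09 at 0.2% would imply a total L₁ norm of roughly 45, which is an inference from the table and not a figure the page states.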
Correlated Neurons
Index    P. Corr.    Cos Sim.
736      +0.09       0.06
1244     +0.09       0.03
1196     +0.08       0.02
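"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The page does not say which vectors each is computed over, so the sketch below only illustrates how the two metrics relate: Pearson correlation is the cosine similarity of the mean-centered vectors. The function names and inputs are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity(
        [x - mean_a for x in a],
        [y - mean_b for y in b],
    )


# Hypothetical inputs: b is a shifted copy of a.
a = [1.0, 2.0, 3.0]
b = [2.0, 3.0, 4.0]
print(pearson_correlation(a, b))  # approximately 1.0, since the shift is removed
print(cosine_similarity(a, b))    # slightly below 1.0, since the shift changes the angle
```

Because the two metrics differ by the centering step, and may also be taken over different vectors, the two columns need not agree in magnitude.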
Negative Logits
sement       -0.71
hunde        -0.62
Italijani    -0.60
garota       -0.58
senhora      -0.57
odkazy       -0.56
righe        -0.56
lapto        -0.56
nakalista    -0.55
gradova      -0.55
Positive Logits
Middles      0.69
female       0.67
Glou         0.66
Valky        0.66
wife         0.65
fameux       0.65
shenan       0.65
woman        0.64
mécanisme    0.64
Wife         0.64
Activations Density: 0.603%