Explanations
places or locations within a story
Neuron Alignment
Index    Value    % of L₁
1013     +0.18    0.5%
906      +0.10    0.3%
227      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
227      +0.18       0.06
1013     +0.10       0.06
1524     +0.09       0.04
Negative Logits
michelin     -0.84
scrat        -0.83
plenti       -0.82
swee         -0.82
disreg       -0.78
panama       -0.78
gild         -0.77
inext        -0.77
frankfurt    -0.77
impra        -0.75
Positive Logits
headquarters       0.81
premises           0.73
offices            0.73
dataclass          0.61
office             0.61
principalColumn    0.59
home               0.59
residence          0.58
Headquarters       0.58
peniten            0.55
Activations Density 0.396%