INDEX
Explanations
references to official meetings and discussions
Neuron Alignment
Index    Value    % of L₁
1233     +0.15    0.5%
1678     +0.13    0.4%
136      +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
136      +0.15       0.03
1233     +0.13       0.04
1336     +0.11       0.04
Negative Logits
dicono    -0.54
brille    -0.52
ANCHE     -0.49
ananas    -0.49
lepid     -0.48
prouve    -0.48
Ephe      -0.48
vogli     -0.48
diable    -0.48
nomme     -0.47
Positive Logits
meeting     1.27
meetings    1.21
meeting     1.20
Meeting     1.19
Meeting     1.17
Meetings    1.11
meetings    1.10
MEETING     1.04
Meetings    1.04
meet        0.84
Activations Density: 0.076%