INDEX
Explanations
references to flu and influenza-related topics
Neuron Alignment
Index    Value    % of L₁
386      +0.16    0.9%
376      +0.13    0.8%
162      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
386      +0.16       0.01
78       +0.13       0.01
300      +0.12       0.01
Negative Logits
efore       -1.64
dear        -1.56
rapper      -1.55
hip         -1.53
hips        -1.48
distance    -1.48
lessly      -1.46
atively     -1.43
´           -1.42
ward        -1.42
Positive Logits
iero       1.78
filings    1.76
burg       1.57
stadt      1.53
share      1.51
bank       1.49
enza       1.49
state      1.48
idity      1.48
ido        1.48
Activations Density: 1.568%