Explanations
short textual excerpts regarding a variety of topics
Neuron Alignment
Index    Value    % of L₁
394      +0.45    1.8%
1108     +0.20    0.8%
1577     +0.18    0.7%
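As a quick consistency check on the Neuron Alignment figures, the sketch below assumes that "% of L₁" is each value's share of the L₁ norm (the sum of absolute values) of the full alignment vector. That reading is an assumption; the dashboard does not define the column. Under it, each row implies a total norm of value / percent, and the three rows should roughly agree. The function name `implied_l1_norm` is hypothetical.

```python
# Rows copied from the Neuron Alignment table: (neuron index, value, % of L1).
rows = [(394, 0.45, 1.8), (1108, 0.20, 0.8), (1577, 0.18, 0.7)]


def implied_l1_norm(value, percent):
    """L1 norm implied by one row, ASSUMING percent = 100 * |value| / L1."""
    return abs(value) / (percent / 100.0)


# One estimate of the total L1 norm per row.
implied = {idx: implied_l1_norm(value, pct) for idx, value, pct in rows}

for idx, norm in implied.items():
    print(f"neuron {idx}: implied L1 norm ~ {norm:.1f}")
```

The three estimates land near 25, and the small spread is within what rounding the percentages to one decimal place would produce, so the figures are at least consistent with the assumed reading.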
Correlated Neurons
Index    P. Corr.    Cos Sim.
394      +0.45       0.06
184      +0.20       0.01
599      +0.18       0.07
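The two columns above can disagree sharply (for example +0.45 against 0.06 for neuron 394). The sketch below illustrates why, assuming "P. Corr." denotes Pearson correlation and "Cos Sim." denotes cosine similarity; the dashboard does not say which vectors each metric is computed over, so the inputs here are made-up illustrative numbers, not data from this feature. Pearson correlation is cosine similarity applied after subtracting each vector's mean, so the two only coincide for zero-mean vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Illustrative vectors only (NOT taken from the dashboard):
# a rises while b falls, but both are entirely positive.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 9.0, 8.0, 7.0]

cos_ab = cosine_similarity(a, b)
pearson_ab = pearson_correlation(a, b)
print(f"cosine similarity:   {cos_ab:+.2f}")
print(f"Pearson correlation: {pearson_ab:+.2f}")
```

For these vectors the Pearson correlation is -1 while the cosine similarity is strongly positive, which shows that the two metrics answer different questions and need not track each other.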
Negative Logits
FTFY           -0.82
Kün            -0.71
Noice          -0.68
Schrö          -0.68
Rgds           -0.67
Lmfao          -0.66
lamborghini    -0.64
munich         -0.64
alkoh          -0.64
Lma            -0.63
Positive Logits
ьаж             0.55
pushFollow      0.49
ویکیپدی         0.47
Географиясе     0.47
RTEE            0.45
➞               0.42
HttpNotFound    0.41
Reentrant       0.39
DatabaseError   0.39
wikipagina      0.39
Activations Density 1.105%