Explanations
contact information such as addresses and phone numbers
Neuron Alignment
Index    Value    % of L₁
2019     +0.28    0.8%
381      +0.13    0.4%
924      +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2019     +0.28       0.11
924      +0.13       0.10
536      +0.11       0.07
Negative Logits
rektur         -0.57
truk           -0.55
INERY          -0.53
oneph          -0.53
tilizer        -0.52
Assista        -0.50
etti           -0.50
Significado    -0.49
ांकि            -0.49
转发           -0.49
Positive Logits
embra      1.12
dispen     1.06
lts        1.03
fep        1.01
squa       1.00
apparti    0.97
wherea     0.96
fte        0.96
milf       0.95
fuf        0.95
Activations Density 0.502%