Explanations
phrases related to competition or rivalries
Neuron Alignment
Index   Value   % of L₁
442     +0.16   0.8%
555     +0.14   0.7%
161     +0.13   0.6%
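
The "% of L₁" column is not defined on this page. One plausible reading, assumed here, is that each value is expressed as its share of the L₁ norm (the sum of absolute values) of the whole weight vector. Under that assumption, a value of 0.16 at 0.8% would imply an L₁ norm of roughly 20, subject to rounding. The sketch below uses a hypothetical helper `l1_share` and toy weights, not the feature's real weights.

```python
def l1_share(weights, top_k=3):
    """Return (index, value, share of L1 norm) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |value| / sum(|all values|).
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("weight vector has zero L1 norm")
    # Rank indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:top_k]]


# Toy vector (hypothetical values chosen so the L1 norm is exactly 1.0).
toy_weights = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(toy_weights, top_k=2)

for index, value, share in rows:
    print(f"{index:<6}{value:+.2f}   {share:.1%}")
```

A small share for even the top-ranked neurons, as in the table above, would suggest the feature is spread across many neurons rather than aligned with a few.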
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.16      0.03
442     +0.14      0.04
1515    +0.13      0.02
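
The page does not say which vectors "P. Corr." (Pearson correlation) and "Cos Sim." (cosine similarity) are computed over, so the following is only a sketch of the two generic definitions, with hypothetical function names and toy data. It shows why the two columns need not agree: Pearson correlation is cosine similarity applied after subtracting each vector's mean.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Toy data (hypothetical): b is a shifted by a constant.
a = [1.0, 2.0, 3.0, 4.0]
b = [11.0, 12.0, 13.0, 14.0]

pearson = pearson_correlation(a, b)
cosine = cosine_similarity(a, b)
print(f"Pearson: {pearson:.3f}, cosine: {cosine:.3f}")
```

In this toy case the correlation is perfect while the cosine similarity is below 1, because the constant offset changes the angle between the raw vectors but not the way they co-vary.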
Negative Logits
Token      Logit
<bos>      -1.61
//---      -0.61
<?         -0.61
ઊ          -0.53
kulum      -0.52
(blank)    -0.52
/*!        -0.52
/***       -0.51
pessoas    -0.51
//...      -0.50
Positive Logits
Token      Logit
Rival      1.23
Rival      1.19
rival      1.18
rivals     1.12
rival      1.11
ecru       1.10
Rivals     0.92
stockholm  0.91
madonna    0.89
🤣🤣       0.88
Activations Density 0.445%
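
Activation density is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is non-zero, so 0.445% would mean the feature fires on roughly 1 token in 225. The helper `activation_density` and the toy activations below are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: "density" counts tokens with activation strictly above 0.
    """
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample (hypothetical): the feature fires on 1 of 200 tokens.
toy_activations = [0.0] * 199 + [1.7]
density = activation_density(toy_activations)
print(f"Activation density: {density:.3%}")
```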