Explanations
instances where the word "facing" is followed by a concept or situation
Neuron Alignment
Index    Value    % of L₁
50       +0.20    1.0%
991      +0.11    0.5%
1133     +0.09    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
991      +0.20       0.04
1352     +0.11       0.04
781      +0.09       0.04
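The two columns in the Correlated Neurons table use standard similarity measures. The sketch below shows the textbook definitions of Pearson correlation and cosine similarity. How this dashboard computes them is not stated in the source, so the inputs (activation series for the correlation, weight vectors for the cosine similarity) and the function names `pearson_corr` and `cosine_sim` are assumptions made for illustration only.

```python
import math


def pearson_corr(xs, ys):
    """Textbook Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Center both series, then normalize their covariance.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / norm


def cosine_sim(u, v):
    """Textbook cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

Pearson correlation centers each series before comparing them, while cosine similarity does not, so the two columns can differ noticeably for the same pair.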
Negative Logits
Token     Logit
<bos>     -3.19
,         -0.78
.         -0.76
else      -0.75
}         -0.75
bot       -0.74
public    -0.74
;         -0.73
|}        -0.73
switch    -0.73
Positive Logits
Token        Logit
affor        2.40
increa       2.33
stockholm    2.30
lidl         2.22
fta          2.18
guarante     2.17
nece         2.14
milf         2.13
ftu          2.12
bangkok      2.09
Activations Density: 0.233%