INDEX
Explanations
mentions of the letter 's'
Neuron Alignment
Index   Value   % of L₁
50      +0.39   1.8%
478     +0.11   0.5%
1506    +0.10   0.5%
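The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each entry's absolute value as a share of the L₁ norm of the full vector. A minimal sketch under that assumption follows; the function name `l1_shares` and the toy vector are hypothetical and do not come from the dashboard.

```python
def l1_shares(weights, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |w_i| divided by the sum of |w_j| over all j.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank indices by absolute value, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical toy vector, not data from the dashboard.
    toy = [0.5, -0.3, 0.2]
    for index, value, share in l1_shares(toy, top_k=2):
        print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

If this reading is right, a small share for even the top-ranked neuron would suggest the vector is spread across many neurons rather than aligned with a few.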
Correlated Neurons
Index   P. Corr.   Cos Sim.
1581    +0.39      0.05
421     +0.11      0.06
276     +0.10      0.06
Negative Logits
<bos>         -2.81
/**           -0.90
/*            -0.80
<?            -0.72
StringCopy    -0.66
(blank)       -0.65
/***          -0.64
HasIndex      -0.64
муніципалі    -0.64
lateinit      -0.64
Positive Logits
affor         1.82
Juf           1.72
unlaw         1.65
increa        1.64
sovere        1.60
accla         1.60
emphat        1.53
impractica    1.52
milf          1.51
inev          1.51
Activations Density: 0.195%