INDEX
Explanations
phrases related to proposals, plans, implementation, and regulations
Neuron Alignment
Index   Value   % of L₁
131     +0.10   0.3%
1507    +0.09   0.3%
1491    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
131     +0.10      0.03
1983    +0.09      0.03
164     +0.09      0.03
Negative Logits
eniably     -0.48
evaluator   -0.47
hdashline   -0.47
pettico     -0.46
swarovski   -0.45
>|</        -0.45
Audiences   -0.44
Còn         -0.43
ecru        -0.42
diffs       -0.42
Positive Logits
espri       0.63
kani        0.59
utaf        0.58
تضيفلها     0.57
facciamo    0.56
lingue      0.56
pama        0.55
teras       0.54
centavos    0.54
voleva      0.53
Activations Density 0.123%