INDEX
Explanations
keywords related to specific game mechanics or rules, possibly related to strategy or instructions
Neuron Alignment
Index   Value   % of L₁
281     +0.16   0.7%
1059    +0.14   0.6%
528     +0.14   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
281     +0.16      0.04
1059    +0.14      0.04
528     +0.14      0.03
Negative Logits
timbangan        -0.54
zobaczyć         -0.46
NamedQueries     -0.46
SneakyThrows     -0.45
secuted          -0.45
initComponents   -0.45
CDCl             -0.45
Bagaimana        -0.44
zostać           -0.44
enaam            -0.44
Positive Logits
PER        1.14
Per        1.13
Per        1.08
exé        1.05
permu      0.99
per        0.99
rempliss   0.97
perpé      0.96
Persi      0.96
Ename      0.94
Activations Density 0.117%