Explanations
words related to technology or computer programming, specifically focusing on object-oriented programming concepts
Neuron Alignment
Index    Value    % of L₁
507      +0.12    0.4%
1343     +0.11    0.3%
1967     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
507      +0.12       0.04
523      +0.11       0.03
956      +0.10       0.03
Negative Logits
vainly             -0.71
unspeak            -0.63
unavoid            -0.60
testifies          -0.56
OBSERVATIONS       -0.55
supposes           -0.55
unbear             -0.54
indescri           -0.54
disambiguazione    -0.53
frowns             -0.53
Positive Logits
sappi       0.91
venuto      0.79
riuscito    0.74
vogli       0.71
parlando    0.70
Ottobre     0.68
arrivato    0.67
bbene       0.67
persino     0.67
voleva      0.67
Activations Density 0.244%
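
As a rough illustration of how a density figure like the one above might be obtained, the sketch below assumes that "activations density" means the fraction of tokens on which the feature's activation exceeds a threshold (zero by default). That definition, the function name activation_density, and the synthetic data are all assumptions made for illustration; the page itself does not state how the number was computed.

```python
import numpy as np


def activation_density(activations, threshold=0.0):
    """Return the fraction of entries strictly above `threshold`.

    Assumed definition: density = (# tokens where the feature fires) / (# tokens).
    """
    acts = np.asarray(activations, dtype=float)
    if acts.size == 0:
        raise ValueError("activations must not be empty")
    return float(np.mean(acts > threshold))


# Hypothetical data: 100,000 tokens, of which 244 activate the feature.
acts = np.zeros(100_000)
acts[:244] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.244%"
```

Under this assumed definition, a density of 0.244% would mean the feature fires on roughly one token in 410.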