Explanations
adverbs related to drastic or dramatic change
Neuron Alignment
Index   Value   % of L₁
478     +0.14   0.5%
976     +0.11   0.4%
1056    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1056    +0.14      0.05
976     +0.11      0.04
478     +0.09      0.04
Negative Logits
attemp      -0.97
intersper   -0.97
kön         -0.96
abstrait    -0.89
reluct      -0.87
eki         -0.87
inev        -0.86
encomp      -0.85
praktik     -0.85
délib       -0.84
Positive Logits
ificantly      0.76
haustible      0.68
pertise        0.67
IALLY          0.64
medies         0.63
greatly        0.62
marginBottom   0.62
ecuted         0.62
autunno        0.59
requently      0.59
Activations Density 0.130%