Explanations
phrases related to historical developments and lifestyles
Neuron Alignment
Index   Value   % of L₁
1013    +0.11   0.3%
1496    +0.10   0.3%
257     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1496    +0.11      0.04
1984    +0.10      0.05
1336    +0.08      0.04
Negative Logits
raught       -0.52
ophenyl      -0.47
cences       -0.46
abancı       -0.45
advies       -0.44
astrous      -0.43
ophen        -0.42
conseguiu    -0.42
Dissolved    -0.41
szuka        -0.41
Positive Logits
polig       0.94
kask        0.90
priva       0.90
utop        0.86
Singapur    0.83
traktor     0.80
anonim      0.79
fasc        0.78
teras       0.77
elek        0.76
Activations Density 0.426%