INDEX
Explanations
URLs with specific intermediate strings
Tokens that are parts of web addresses, email/domain names, or filenames (e.g., ".com" and site/filename fragments).
Neuron Alignment
Index   Value   % of L₁
764     +0.41   1.4%
856     +0.21   0.7%
184     +0.21   0.7%
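
The "% of L₁" figures are consistent with each value's share of a vector's L₁ norm (0.41 at 1.4% and 0.21 at 0.7% both imply a norm of roughly 29 to 30). A minimal Python sketch of that reading follows. It is an assumption about how the column is derived, not something the page confirms; the helper name `l1_shares` and the toy vector are hypothetical.

```python
def l1_shares(weights):
    """Return (index, value, share of L1 norm) rows, largest |value| first.

    Assumption: "% of L1" means |w_i| / sum_j |w_j| for a weight vector w.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    rows = [(i, w, abs(w) / l1) for i, w in enumerate(weights)]
    # sorted() is stable, so ties keep their original index order
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


# Hypothetical toy vector, not the real feature direction.
toy = [0.5, -0.25, 0.25, 0.0]
top = l1_shares(toy)
for index, value, share in top[:3]:
    print(f"{index:<6}{value:+.2f}   {share:.1%}")
```

Under this reading, the shares of all entries sum to 100%, so small percentages for the top entries indicate a direction spread across many neurons.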
Correlated Neurons
Index   P. Corr.   Cos Sim.
764     +0.41      0.01
137     +0.21      0.01
1784    +0.21      0.01
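
"P. Corr." and "Cos Sim." are different measures, which is why the same neuron can score very differently on each. The sketch below gives the standard definitions of Pearson correlation and cosine similarity in plain Python. Which vectors the page actually compares is not stated here, so the inputs `x` and `y` are hypothetical placeholders.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical data: y is a scaled and shifted copy of x.
x = [1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0]
r = pearson(x, y)
c = cosine_similarity(x, y)
```

Here the correlation is 1 up to rounding because `y` is an exact affine function of `x`, while the cosine similarity is slightly below 1 because the shift changes the direction of the vector.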
Negative Logits
ideolog   -0.83
notor     -0.77
pecuni    -0.76
solidar   -0.73
indeb     -0.70
ristor    -0.70
utop      -0.67
incess    -0.66
lusso     -0.65
márm      -0.64
Positive Logits
IsContent   0.73
shenan      0.59
unspeak     0.58
exasper     0.55
ngl         0.54
ineffec     0.53
impelled    0.52
.*")]       0.52
endeav      0.52
laboring    0.51
Activations Density 0.016%