INDEX
Explanations
phrases related to excerpts, screenshots, and reports
Neuron Alignment
Index   Value   % of L₁
1150    +0.11   0.4%
690     +0.11   0.3%
122     +0.10   0.3%
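
One plausible reading of the "% of L₁" column is that it reports each neuron's absolute value as a share of the L₁ norm (the sum of absolute values) of the whole vector. The page does not state this, so the sketch below is an assumption, and both the function name `l1_share` and the toy vector are hypothetical.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the L1 norm of `weights`.

    Assumption: this is how a "% of L1" column could be derived; the
    source page does not confirm the exact definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(weights[index]) / l1_norm


# Toy vector (hypothetical values, not taken from the dashboard).
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```

Under this reading, a small percentage for the top-ranked neurons would mean the vector's weight is spread across many neurons rather than concentrated in a few.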
Correlated Neurons
Index   P. Corr.   Cos Sim.
1044    +0.11      0.03
1467    +0.11      0.02
1305    +0.10      0.02
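
The column labels "P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The sketch below shows the standard definitions of both on two plain lists of numbers. Which vectors the dashboard actually compares (activations, weights, or something else) is not stated on the page, so the inputs here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical inputs, for illustration only.
u = [1.0, 2.0, 3.0]
v = [3.0, 5.0, 7.0]
print(pearson_correlation(u, v), cosine_similarity(u, v))
```

The two measures differ only in the mean-centering step, which is why a pair of vectors can have a noticeably different Pearson correlation and cosine similarity.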
Negative Logits
shutterstock   -0.90
ordina         -0.87
pixabay        -0.87
Chá            -0.85
ⓧ              -0.83
Perci          -0.80
Nö             -0.79
onor           -0.79
nomine         -0.79
intit          -0.78
Positive Logits
snippets       0.78
screenshot     0.76
snippet        0.74
screenshots    0.71
excerpts       0.69
of             0.66
excerpt        0.66
cerpts         0.66
snapshot       0.64
snapshots      0.62
Activations Density: 0.162%
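
Activation density is commonly understood as the fraction of tokens on which a feature fires at all. The page does not define it, so the sketch below assumes that reading; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (positive) activation; the source page does not spell this out.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations, for illustration only.
sample = [0.0, 0.0, 0.5, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.162% would mean the feature is active on roughly 16 of every 10,000 tokens.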