Explanations
phrases related to extracting information or data from text or a webpage
Neuron Alignment
Index   Value   % of L₁
1328    +0.16   0.6%
849     +0.14   0.5%
214     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
849     +0.16      0.03
1328    +0.14      0.03
1416    +0.13      0.02
Negative Logits
Usar        -0.55
Izvori      -0.54
Dijo        -0.54
Deber       -0.52
Comprar     -0.50
inaugura    -0.49
Continuar   -0.48
compréhen   -0.48
Viene       -0.48
Quais       -0.48
Positive Logits
extract      1.22
extraction   1.22
extract      1.21
EXTRACT      1.16
extracts     1.13
extraction   1.13
Extraction   1.12
extracting   1.12
extractor    1.11
extracted    1.09
Activations Density 0.110%