INDEX
Explanations
phrases with the word "under" followed by a specific entity or concept
Neuron Alignment
Index   Value   % of L₁
994     +0.09   0.3%
732     +0.09   0.3%
1110    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1793    +0.09      0.03
562     +0.09      0.03
1110    +0.09      0.02
Negative Logits
jComboBox       -0.53
ยม              -0.46
waitKey         -0.45
})+\            -0.45
efeated         -0.44
')              -0.44
готовка         -0.44
();)            -0.44
DeleteMapping   -0.44
ActionResult    -0.43
Positive Logits
hcm        1.01
vola       1.01
hunde      1.00
inder      0.97
ideolog    0.95
Singapur   0.95
utop       0.95
migra      0.93
solidar    0.92
alkoh      0.91
Activations Density: 0.126%