INDEX
Explanations
significant overall assessments or ratings in the text
Neuron Alignment
Index   Value   % of L₁
50      +0.27   1.4%
699     +0.13   0.7%
479     +0.09   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
699     +0.27      0.04
442     +0.13      0.03
479     +0.09      0.03
Negative Logits
<bos>      -2.53
/***       -0.80
<?         -0.71
///**      -0.66
/*!        -0.64
/**        -0.62
//---      -0.61
<!--       -0.57
assistir   -0.57
consult    -0.56
Positive Logits
Minang      1.39
perfon      1.29
lidl        1.28
bandung     1.28
chrysler    1.27
stockholm   1.24
Palembang   1.24
Karang      1.22
embodi      1.19
jaya        1.19
Activations Density 0.056%