Explanations
names of celebrities or characters being portrayed as other celebrities or characters
Neuron Alignment
Index   Value   % of L₁
1343    +0.20   0.6%
184     +0.15   0.5%
1177    +0.14   0.4%
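The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's weight vector. The sketch below illustrates that reading, which is an assumption and not something this page confirms. The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude weights.

    Assumption: "% of L1" means |w_i| divided by the sum of |w_j| over all entries.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:top_k]]


# Toy weight vector for illustration only.
toy = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(toy, top_k=2)
for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, a value of +0.20 at 0.6% would imply an L₁ norm of roughly 33 for the full vector, which is consistent with weight spread thinly across many neurons.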
Correlated Neurons
Index   P. Corr.   Cos Sim.
227     +0.20      0.03
1177    +0.15      0.02
1984    +0.14      0.03
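The two columns above are standard similarity measures. The sketch below shows how each could be computed, assuming "P. Corr." denotes Pearson correlation (over activation values) and "Cos Sim." denotes cosine similarity (over weight vectors). Which vectors the dashboard actually compares is not stated here, so the inputs are placeholders and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Placeholder data, not taken from the dashboard.
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine([1.0, 0.0], [0.0, 1.0]))
```

The two measures can diverge, as in the table: a neuron may co-activate with the feature (moderate correlation) while pointing in a nearly orthogonal direction in weight space (cosine similarity near zero).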
Negative Logits
Token           Logit
<bos>           -0.91
Paglinawan      -0.72
kasarigan       -0.68
GEBURTSDATUM    -0.68
nogen           -0.62
ország          -0.61
تقاوى           -0.61
about           -0.58
Становништво    -0.57
styleType       -0.57
Positive Logits
Token       Logit
exé         1.63
prouve      1.53
rafra       1.45
dépasse     1.44
trouvera    1.43
prétend     1.42
renfer      1.41
pixabay     1.40
prendra     1.39
représ      1.38
Activations Density: 0.102%