Explanations
mentions of India or things related to India
Neuron Alignment
Index   Value   % of L₁
1271    +0.15   0.6%
1323    +0.13   0.5%
1035    +0.12   0.5%
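The page does not say how the "% of L₁" column is computed. A plausible reading, and only an assumption here, is that each neuron's share is the absolute value of its entry divided by the L₁ norm of the whole vector. If that reading is right, the first row implies an L₁ norm of roughly 0.15 / 0.006 ≈ 25. The function name `top_aligned_neurons` and the toy vector below are hypothetical and are not taken from the page.

```python
def top_aligned_neurons(weights, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumption: "% of L1" means abs(value) / sum(abs(all values)).
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank neuron indices by the magnitude of their entry, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:k]]


# Toy vector for illustration only; these are not the feature's real weights.
weights = [0.5, -1.5, 0.0, 2.0]
rows = top_aligned_neurons(weights, k=2)
for index, value, share in rows:
    print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Shares as small as those in the table (about half a percent each for the top three neurons) would indicate that the vector is spread across many neurons rather than concentrated in a few.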
Correlated Neurons
Index   P. Corr.   Cos Sim.
1271    +0.15      0.04
1035    +0.13      0.04
650     +0.12      0.03
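The two columns above are not defined on the page. Assuming "P. Corr." is the Pearson correlation between two activation series over the same inputs and "Cos Sim." is the cosine similarity between two weight vectors, both can be sketched with the standard formulas. The helper names `pearson` and `cosine` and the sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data for illustration only; not taken from the page.
feature_acts = [0.0, 0.0, 1.2, 0.0, 0.8]
neuron_acts = [0.1, -0.2, 0.9, 0.0, 0.4]
print(f"P. Corr.: {pearson(feature_acts, neuron_acts):+.2f}")
print(f"Cos Sim.: {cosine([1.0, 0.0, 2.0], [0.5, 1.0, 0.0]):.2f}")
```

Under these assumptions the two numbers measure different things: correlation compares behaviour across inputs, while cosine similarity compares directions in weight space, so a neuron can rank high on one and low on the other.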
Negative Logits
krishna       -0.76
muna          -0.76
purtroppo     -0.71
mandal        -0.68
karna         -0.66
nemmeno       -0.66
kumar         -0.65
maha          -0.65
chiaramente   -0.64
jaya          -0.63
Positive Logits
Indian        1.17
India         1.16
India         1.14
Indian        1.13
Indians       1.09
Gorb          1.08
Schrö         1.03
Abbé          1.02
Indians       1.00
indian        0.99
Activations Density 0.118%