Explanations
mentions of "ji" groups, particularly related to extremist activities
Neuron Alignment
Index    Value    % of L₁
1961     +0.17    0.7%
1276     +0.15    0.6%
144      +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
144      +0.17       0.02
1637     +0.15       0.02
1276     +0.13       0.02
Negative Logits
ValueGeneration    -0.56
Дереккөздер        -0.56
د                  -0.54
Após               -0.54
ROGEN              -0.53
TargetException    -0.53
CloseOperation     -0.53
DrawerToggle       -0.52
هذا                -0.52
Oregon             -0.52
Positive Logits
increa       1.61
embra        1.59
emphat       1.58
suscep       1.57
pessi        1.54
maneu        1.51
inev         1.49
reluct       1.49
strick       1.48
intermitt    1.48
Activations Density: 0.142%