INDEX
Explanations
important keywords related to political and military affairs
Neuron Alignment
Index   Value   % of L₁
227     +0.09   0.3%
581     +0.08   0.2%
674     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
594     +0.09      0.06
227     +0.08      0.07
324     +0.08      0.04
Negative Logits
nakalista         -0.61
enderror          -0.58
utop              -0.57
frans             -0.57
)_/¯              -0.57
""],              -0.56
Ufer              -0.56
solidar           -0.56
IVEREF            -0.56
DebuggerNonUser   -0.55
Positive Logits
hairc     1.89
cushi     1.70
impra     1.63
disreg    1.60
affor     1.56
scrat     1.55
milf      1.53
disagre   1.53
reluct    1.47
hentai    1.46
Activations Density 0.568%