INDEX
Explanations
references to role-playing games (RPGs) and character preferences in gaming
Neuron Alignment
Index   Value   % of L₁
964     +0.11   0.3%
1013    +0.11   0.3%
674     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2044    +0.11      0.08
1317    +0.11      0.06
62      +0.10      0.04
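The two columns above report standard similarity measures. As a minimal sketch, assuming "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two weight vectors (the page does not say which vectors are compared), they could be computed as follows. The function names and inputs are hypothetical and not taken from the dashboard.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed: activations)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Center both series before comparing them.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return num / den


def cosine_sim(u, v):
    """Cosine similarity of two equal-length vectors (assumed: weight directions)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Pearson correlation subtracts the mean before comparing while cosine similarity does not, which is one reason the two columns can differ for the same neuron.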
Negative Logits
Token     Value
Khart     -1.67
dises     -1.64
Juf       -1.63
Keny      -1.59
sovere    -1.57
fta       -1.57
Augu      -1.53
Confu     -1.51
volunte   -1.49
thut      -1.49
Positive Logits
Token     Value
<bos>     1.11
my        1.00
myself    0.84
me        0.73
meine     0.72
mijn      0.71
I         0.67
domada    0.66
我的      0.66
lately    0.65
Activations Density: 0.963%