INDEX
Explanations
word tokens related to invitations or extending invites
Neuron Alignment
Index    Value    % of L₁
314      +0.13    0.5%
200      +0.13    0.5%
553      +0.13    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
314      +0.13       0.02
200      +0.13       0.02
492      +0.13       0.03
Negative Logits
ksessa          -0.56
Inoltre         -0.54
ksesta          -0.50
Fase            -0.50
misure          -0.49
guma            -0.49
Semif           -0.48
HasMaxLength    -0.48
teras           -0.48
ricerche        -0.48
Positive Logits
Invited         1.27
Invite          1.24
invite          1.23
depic           1.23
invitation      1.21
Invitation      1.20
reluct          1.20
invitations     1.19
invited         1.14
invites         1.13
Activations Density: 0.072%