INDEX
Explanations
words related to invitations or involvement in events
Negative Logits
ellen       -0.16
eron        -0.15
ICC         -0.15
زة          -0.15
eness       -0.15
ums         -0.15
724         -0.15
wy          -0.15
peration    -0.14
jsp         -0.14
Positive Logits
inv               0.34
Inv               0.28
incible           0.26
inv               0.26
(inv              0.26
-inv              0.25
.inv              0.24
TargetException   0.23
olved             0.22
Inv               0.22
Activations Density 0.016%
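
To put the density figure in concrete terms, the sketch below converts it into an expected count of activating tokens. It assumes that "Activations Density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name and the sample size are hypothetical, chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumes density is the percentage of tokens with a nonzero activation
    (an interpretation of the dashboard label, not confirmed by the source).
    """
    return density_percent / 100.0 * n_tokens


density_percent = 0.016   # value reported on the dashboard
sample_size = 1_000_000   # hypothetical sample size, for illustration only

expected = expected_active_tokens(density_percent, sample_size)
print(f"about {expected:.0f} activating tokens per {sample_size:,} tokens")
```

Under that reading, the feature fires on roughly 160 tokens per million, which is what one would expect from a sparse feature tied to a narrow family of "inv"-prefixed tokens.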