INDEX
Explanations
specific names or titles, particularly those that seem to represent people or characters in various contexts
Negative Logits
_DIST     -0.15
egot      -0.14
ximity    -0.14
wood      -0.13
ioc       -0.13
aney      -0.13
dal       -0.13
virt      -0.13
egov      -0.13
渡        -0.13
Positive Logits
ertest    0.17
isia      0.16
=pk       0.15
*)((      0.15
nown      0.15
orks      0.14
íݸ        0.14
orgot     0.14
ool       0.13
ylon      0.13
Activations Density: 0.503%