Explanations
references originating or directing information to specific individuals
Negative Logits
iencies      -0.85
estyles      -0.79
merce        -0.75
few          -0.74
lied         -0.74
ikes         -0.73
selection    -0.73
pioneered    -0.73
wcsstore     -0.71
ãĤ¦ãĤ¹       -0.71
Positive Logits
superiors    0.88
afar         0.84
sender       0.82
whence       0.81
abroad       0.76
heaven       0.75
Cyrus        0.73
Burk         0.67
Papa         0.67
inside       0.66
Activations Density 0.089%