Explanations
names related to military figures or characters
names of places and characters associated with a particular narrative or context
Negative Logits
nect                 -0.90
peat                 -0.72
overboard            -0.72
washing              -0.70
ciples               -0.68
figure               -0.65
joice                -0.64
issance              -0.64
natureconservancy    -0.64
merce                -0.64
Positive Logits
inia                 1.05
inus                 0.90
a                    0.84
inian                0.80
iem                  0.80
ij士                 0.79
andr                 0.79
pheus                0.77
ername               0.76
mund                 0.74
Activations Density 0.140%