INDEX
Explanations
names of people or places
proper nouns and character names in media contexts
Negative Logits
Adin      -0.88
®         -0.78
ivan      -0.75
ohn       -0.71
obal      -0.70
orial     -0.69
VID       -0.69
eatures   -0.69
VERT      -0.68
horm      -0.67
Positive Logits
indefinitely   0.70
overhead       0.67
following      0.66
operations     0.64
reverber       0.63
expansion      0.62
tomorrow       0.61
fiasco         0.61
ahead          0.61
amidst         0.60
Activations Density: 0.639%