Explanations
terms related to economic and financial matters
terms related to demographics and social categories
Negative Logits
Token       Logit
emb         -0.69
inward      -0.58
Archdemon   -0.58
antit       -0.57
borne       -0.56
Merit       -0.56
brim        -0.56
stride      -0.54
Nap         -0.54
upfront     -0.54
Positive Logits
Token       Logit
theless     1.30
ukong       0.87
tenance     0.86
ciating     0.84
xual        0.84
ifix        0.81
vous        0.79
urities     0.77
asking      0.74
sers        0.72
Activations Density: 0.143%