INDEX
Explanations
words related to people and their actions or roles in society
Negative Logits
©¶æ            -0.78
isoft          -0.69
referen        -0.64
Priv           -0.64
ASED           -0.63
subp           -0.62
tandem         -0.60
insensitive    -0.60
Expend         -0.59
divid          -0.58
Positive Logits
swick          1.24
shire          1.02
worth          1.01
beck           0.96
inki           0.92
adesh          0.90
fleet          0.89
eston          0.86
iland          0.86
field          0.85
Activations Density: 0.061%