Explanations
mentions of social or legal constructs and regulations
words indicating relationship statuses and regulatory concepts
Negative Logits
adan      -0.91
utra      -0.87
osate     -0.81
adra      -0.80
achu      -0.76
ocene     -0.76
eus       -0.74
yip       -0.73
Tycoon    -0.72
è£ħ       -0.71
Positive Logits
ones      0.83
paren     0.82
eals      0.72
versions  0.72
persons   0.71
ness      0.71
goods     0.69
objects   0.69
foreign   0.68
lung      0.66
Activations Density 0.267%
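
As a rough illustration of the density figure above, here is a minimal sketch that assumes "activation density" means the fraction of sampled tokens on which the feature has a nonzero activation. That definition, the function name `activation_density`, and the sample data are all assumptions for illustration and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition of density; the dashboard's exact method is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 1000 tokens.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.267% would correspond to the feature firing on roughly one token in every 375.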