INDEX
Explanations
words related to family relationships
references to familial relationships, particularly those marked by legal or marital terms
Negative Logits
enthusi        -0.70
referen        -0.68
ħĭ             -0.67
penetrated     -0.66
suppress       -0.64
smugglers      -0.63
transl         -0.62
convol         -0.62
incorpor       -0.62
mathemat       -0.61
Positive Logits
sight          1.03
death          0.89
speech         0.88
terms          0.86
order          0.85
arms           0.84
dist           0.81
advertising    0.81
chief          0.80
Lago           0.80
Activations Density 0.044%