Explanations
words related to marital status, specifically identifying individuals who are married
instances of the word "married."
Negative Logits
Token        Logit
urg          -0.78
abwe         -0.78
Flavoring    -0.77
ostic        -0.74
ebin         -0.70
affer        -0.69
efer         -0.69
acco         -0.68
umbn         -0.67
osta         -0.66
Positive Logits
Token        Logit
nesday       1.01
couples      0.84
ton          0.78
married      0.76
tons         0.76
equality     0.75
marry        0.75
equality     0.74
thood        0.73
women        0.73
Activations Density: 0.025%
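
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the tool that produced this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not confirm this definition.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 40,000 tokens, of which 10 have a nonzero activation.
sample = [0.0] * 39_990 + [1.5] * 10
density = activation_density(sample)

# Print as a percentage with three decimals, the precision used on the page.
print(f"{density:.3%}")
```

Under this reading, a density of 0.025% corresponds to the feature firing on roughly 1 token in 4,000, which fits a feature tied to a narrow set of words such as "married".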