INDEX
Explanations
personal information related to individuals
possessive forms that indicate ownership or relationships
Negative Logits
pty       -0.83
yrs       -0.79
Canaver   -0.76
chers     -0.76
mir       -0.76
quished   -0.75
icably    -0.72
ij士      -0.70
agles     -0.70
TAIN      -0.69
Positive Logits
own            1.42
surroundings   1.20
identity       1.08
genitals       1.06
behavior       1.03
preferences    1.03
favourite      1.02
personality    1.01
favorite       1.01
behaviour      1.01
Activations Density 0.275%
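
The page does not say how the density figure is computed. A common reading is the fraction of sampled tokens on which the feature's activation is non-zero, and the sketch below assumes that definition. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: the source page does not state its exact formula.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical per-token activations: the feature fires on 2 of 8 tokens.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 0.275% would mean the feature fires on roughly 1 token in every 364 sampled.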