INDEX
Explanations
phrases related to personal possessions or activities
possessive pronouns indicating ownership
Negative Logits
Token        Logit
ylum         -0.89
Statement    -0.78
vernment     -0.77
ablishment   -0.76
Dialogue     -0.75
aneers       -0.75
Report       -0.74
Crime        -0.74
女           -0.73
pects        -0.72
Positive Logits
Token        Logit
favorite     1.27
own          1.23
favourite    1.23
grandma      1.07
girlfriend   1.07
grandmother  1.03
buddy        1.01
buddies      1.00
friend       1.00
dad          0.98
Activations Density: 0.174%
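The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the dashboard:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations over 1000 token positions (illustrative only).
sample = [0.0] * 995 + [0.4, 1.2, 0.0, 2.5, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.174% would mean the feature fires on roughly 17 of every 10,000 tokens.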