INDEX
Explanations
personal relationships and interactions
references to personal connections and relationships
Negative Logits
ahime
-0.88
respectively
-0.80
¿½
-0.70
];
-0.64
rame
-0.64
uers
-0.64
hement
-0.63
avering
-0.61
osures
-0.61
srfAttach
-0.61
Positive Logits
knows         1.53
deserves      1.32
except        1.15
hates         1.15
qualifies     1.14
MUST          1.12
owes          1.10
ought         1.08
whatsoever    1.07
understands   1.07
Activations Density 0.407%
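
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below illustrates that arithmetic under two assumptions not stated on this page: that "density" means the fraction of tokens with an activation above zero, and that the sample is 100,000 tokens. The function name `activation_density` and the sample data are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 407 active tokens out of 100,000 sampled tokens.
sample = [1.0] * 407 + [0.0] * 99_593
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.407%"
```

On this reading, a density of 0.407% means the feature is active on roughly 4 of every 1,000 sampled tokens.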