INDEX
Explanations
family relationships or family-related terms, particularly focusing on parent-child relationships
references to individuals involved in incidents, particularly in a legal or distressing context
Negative Logits
izoph     -0.72
raltar    -0.66
peed      -0.65
ichick    -0.64
åĮ        -0.62
bes       -0.62
pell      -0.61
orting    -0.60
spons     -0.60
OME       -0.60
Positive Logits
's          0.98
fled        0.89
suffered    0.88
surn        0.88
died        0.87
belonged    0.86
was         0.85
testified   0.85
awoke       0.82
withdrew    0.82
Activations Density: 0.146%