INDEX
Explanations
personal anecdotes or stories
expressions of personal experiences and emotions related to family and relationships
Negative Logits
haar      -0.64
issance   -0.59
Dhabi     -0.55
pires     -0.52
uid       -0.51
bris      -0.51
ettel     -0.51
rium      -0.51
ngth      -0.50
oshenko   -0.50
Positive Logits
their     1.61
Their     1.49
they      1.49
my        1.44
myself    1.44
They      1.41
Their     1.40
they      1.40
They      1.39
THEIR     1.39
Activations Density 0.775%