Explanations
references to familial relationships and emotional connections
Negative Logits
Token      Logit
itsu       -0.15
Záp        -0.15
URITY      -0.15
γμα        -0.15
appa       -0.15
jist       -0.14
EXEMPLARY  -0.14
deniz      -0.14
EDI        -0.14
Äija       -0.14
Positive Logits
Token      Logit
loved      0.73
Loved      0.60
relatives  0.54
family     0.52
relative   0.46
friends    0.43
Relatives  0.42
family     0.42
relative   0.41
Relative   0.38
Activations Density: 0.247%