INDEX
Explanations
references to family relationships and familial roles
Negative Logits
aji         -0.76
uca         -0.75
inen        -0.72
ourning     -0.67
oji         -0.65
arus        -0.65
iolet       -0.65
tics        -0.64
ardo        -0.63
someone     -0.62
Positive Logits
hesis        1.21
heses        1.11
hetical      0.94
hood         0.85
entity       0.80
NetMessage   0.78
parts        0.77
ship         0.76
borough      0.75
sonian       0.75
Activations Density: 0.033%