INDEX
Explanations
instances where a person or entity is being referred to as occupying a specific position or role
Negative Logits
haps            -0.70
Fine            -0.62
participants    -0.59
members         -0.59
ERAL            -0.58
theless         -0.58
USS             -0.58
nces            -0.58
OSH             -0.58
then            -0.57
Positive Logits
whom            0.93
ape             0.80
swer            0.77
atown           0.75
sunglasses      0.69
cr              0.64
nikov           0.63
coat            0.63
ierre           0.63
planet          0.60
Activations Density 0.174%