INDEX
Explanations
plural noun phrases indicating groups of people
references to different groups of people and their characteristics or actions
Negative Logits
Token       Logit
ĺ           -0.75
friend      -0.70
gy          -0.69
thood       -0.64
ĵ           -0.63
ķ           -0.60
gyn         -0.60
busters     -0.59
gil         -0.59
ison        -0.58
Positive Logits
Token         Logit
lurking       0.76
somew         0.68
abouts        0.68
hiding        0.68
overlap       0.68
circulating   0.65
mismatch      0.65
othal         0.65
waiting       0.64
ripple        0.63
Activations Density: 0.355%