Explanations
proper nouns and names of individuals
references to specific individuals, particularly names
Negative Logits
SHARE     -0.62
SIZE      -0.60
INST      -0.57
cause     -0.54
PUBLIC    -0.53
IUM       -0.52
FAT       -0.52
OFFIC     -0.51
ŃĶ        -0.51
SHARES    -0.51
Positive Logits
quart     0.88
illard    0.82
arre      0.80
pai       0.80
elaide    0.75
enburg    0.73
insula    0.73
ey        0.71
eyes      0.71
iets      0.71
Activations Density 0.117%