INDEX
Explanations
pronouns or phrases referring to unknown identities or subjects
references to individuals or entities
Negative Logits
ļéĨĴ          -0.64
eme           -0.64
framework     -0.63
absence       -0.62
GV            -0.60
strip         -0.60
caution       -0.59
ranging       -0.59
saturation    -0.58
earchers      -0.58
Positive Logits
owns          1.32
else          1.09
cares         1.03
knows         1.00
pays          0.98
occupies      0.98
oping         0.97
participates  0.95
deserves      0.93
governs       0.92
Activations Density: 0.048%