INDEX
Explanations
pronouns referring to groups of people or entities
references to specific individuals or entities with high influence or significance
Negative Logits
interstitial    -0.57
verted          -0.56
Afterwards      -0.55
ikers           -0.54
Insp            -0.54
ãĥ¬             -0.54
uted            -0.53
ixt             -0.52
wordpress       -0.52
Wikipedia       -0.52
Positive Logits
needs           1.18
already         1.11
wants           1.08
could           1.08
knows           1.06
might           1.06
prefers         1.03
intends         1.02
may             1.02
lacks           1.01
Activations Density: 0.521%