Explanations
pronouns and possessive determiners referring to unspecified entities
references to the concept of 'its' in various contexts
Negative Logits
Token    Logit
tsy      -0.84
tch      -0.84
Gun      -0.74
roup     -0.72
amiya    -0.70
bys      -0.68
knife    -0.66
poke     -0.65
uls      -0.65
VS       -0.64
Positive Logits
Token          Logit
contents       1.27
usefulness     1.16
importance     1.16
significance   1.15
origins        1.13
effects        1.12
implications   1.09
creator        1.08
predecessor    1.08
impact         1.08
Activations Density: 0.204%
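
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below shows one way such a number might be computed, assuming "fires" means an activation strictly above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the sample is constructed only so that the arithmetic lands on the reported 0.204% and is not the data behind this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a feature counts as active on a token when its
    activation is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 25,000 tokens, 51 of which activate the feature.
sample = [0.0] * 24949 + [0.7] * 51

density = activation_density(sample)
print(f"{density:.3%}")  # 51 / 25000, printed as 0.204%
```

A density this low would mean the feature is sparse: under the assumption above, it is active on roughly 2 of every 1,000 tokens.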