INDEX
Explanations
attends to prepositions from possessive pronouns and quantified expressions
Head Attr Weights
0:0.08
1:0.17
2:0.10
3:0.06
4:0.04
5:0.03
6:0.07
7:0.41
Negative Logits
ControllerBase    -0.29
astify            -0.27
Gottlieb          -0.27
PARATUS           -0.26
urtles            -0.26
Kram              -0.25
ouge              -0.25
EconPapers        -0.25
Davison           -0.25
sec               -0.25
Positive Logits
RegressionTest    0.45
Vidite            0.35
AccessorTable     0.35
>=",              0.34
SharedDtor        0.32
RenderAtEndOf     0.31
:✨                0.30
EditorBrowsable   0.30
EndGlobalSection  0.28
perdita           0.28
Activations Density 0.001%