INDEX
Explanations
possessive pronouns and possessive forms indicating ownership or relational context
Negative Logits
ocy          -0.18
yourselves   -0.15
bb           -0.15
mw           -0.14
inspace      -0.14
@(           -0.14
NgModule     -0.14
ools         -0.13
ones         -0.13
ocol         -0.13
Positive Logits
aim          0.24
goal         0.23
focus        0.18
presence     0.18
goal         0.18
mere         0.17
缮           0.17
oret         0.16
attempt      0.16
failure      0.15
Activations Density 0.187%