Explanations
references to character relationships and dynamics
"Who" followed by a verb
people who like or know things
Negative Logits
Token         Logit
pons          -0.54
zostało       -0.53
ksikon        -0.51
Viitteet      -0.51
getRule       -0.51
ณ             -0.50
zostały       -0.47
Secure        -0.46
EndContext    -0.46
stantial      -0.45
Positive Logits
Token         Logit
prefers       1.39
loves         1.33
prefer        1.28
likes         1.27
preferring    1.22
prefer        1.19
hates         1.19
Prefer        1.17
Prefer        1.15
gusta         1.08
Activations Density 0.355%
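
The density figure can be read as a sparsity measure. The sketch below assumes, since the page itself does not say so, that density means the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all; the dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, with the feature firing on 4.
sample = [0.0] * 996 + [1.2, 0.7, 3.1, 0.4]
density = activation_density(sample)
print(f"Activations density: {density:.3%}")
```

Under this reading, a density of 0.355% would mean the feature fires on roughly 3 to 4 tokens per 1000 sampled.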