INDEX
Explanations
the verb "take" in various forms and contexts
Negative Logits
mun     -0.06
rij     -0.06
andon   -0.06
rom     -0.06
ime     -0.06
pus     -0.06
ri      -0.06
/we     -0.06
rika    -0.06
nu      -0.06
Positive Logits
advantage       0.14
aways           0.10
uchi            0.08
участь          0.08
htag            0.08
risks           0.08
refuge          0.08
responsibility  0.08
pride           0.08
advant          0.07
Activations Density 0.090%