Explanations
friend and friends in multiple languages
Negative Logits
INTERACTIONS    0.41
nrows           0.37
oys             0.36
(*)             0.36
bounded         0.35
Interactions    0.34
(*)             0.34
ruk             0.34
pareil          0.33
erros           0.33
Positive Logits
friend          2.58
friends         2.47
친구            2.34
朋友            2.25
友人            2.20
เพื่อน           2.17
friends         2.05
친구            2.02
vriend          1.99
صدي             1.98
Activations Density 0.023%