INDEX
Explanations
friends, colleagues, and companions
Negative Logits
Token  Logit
늄  0.44
мальчика  0.44
وزار  0.39
menino  0.38
OUR  0.38
তিনটি  0.37
suas  0.37
nasze  0.36
沽  0.36
꿋  0.36
Positive Logits
Token  Logit
friends  2.00
colleagues  1.97
peers  1.65
朋友  1.63
เพื่อน  1.60
colleague  1.56
friend  1.55
friends  1.55
companions  1.55
compañeros  1.54
Activations Density 0.121%
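
An activation density figure like the one above is typically read as the share of sampled tokens on which the feature fires. The sketch below shows that arithmetic under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from this dashboard or its tooling.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of tokens on which the
    feature is active; the dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 121 activate the feature.
sample = [0.0] * 99_879 + [1.5] * 121
density = activation_density(sample)
print(f"Activation density: {density:.3f}%")
```

With this made-up sample, 121 active tokens out of 100,000 give a density of the same order as the figure reported above; the real value depends on the dataset and threshold the dashboard used.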