INDEX
Explanations
possessives with associated entities
Negative Logits
untansi  0.43
शख्स  0.38
sentation  0.37
আন  0.36
특징  0.35
にあった  0.35
iería  0.35
男子  0.35
目录  0.34
umina  0.34
Positive Logits
counterparts  0.98
partners  0.95
partner  0.91
friends  0.90
colleagues  0.89
companions  0.86
counterpart  0.85
teammates  0.84
neighbors  0.82
comrades  0.80
Activations Density 0.025%