Explanations
reciprocal actions between entities
Negative Logits
pediu        0.73
który        0.64
multitudes   0.62
ي            0.60
ejer         0.59
indicó       0.59
কিন্ত         0.58
женой        0.57
swojego      0.56
پدر          0.56
Positive Logits
mutually     0.53
in           0.52
彼此          0.50
相互          0.49
Mutual       0.48
ä            0.48
єї           0.46
birbir       0.45
einander     0.45
through      0.44
Activations Density: 0.029%