Explanations
defining or interpreting concepts
Negative Logits
Token      Logit
డబ్బు      0.39
কাপড়      0.37
деньги     0.36
بيض        0.36
র্তি       0.36
basura     0.36
shamp      0.36
ﺅ          0.36
শুকনো      0.36
पैसा       0.35
Positive Logits
Token          Logit
transversal    0.47
coer           0.46
perempt        0.46
declinations   0.45
outlining      0.43
coherent       0.43
concret        0.41
transverse     0.41
transvers      0.41
interpret      0.40
Activations Density: 0.003%