INDEX
Explanations
explaining effort and hard work
Negative Logits
Velocity  0.40
Th  0.39
Th  0.38
formaldehyde  0.38
টিক্কা  0.38
চলমান  0.38
সংখ্য  0.37
সার্জ  0.37
mot  0.36
surgical  0.36
POSITIVE LOGITS
harder  0.86
努力  0.84
effort  0.82
मेहनत  0.77
辛苦  0.75
hard  0.73
laborious  0.73
esfor  0.71
노력  0.71
ardu  0.71
Activations Density 0.014%