INDEX
Explanations
causing, routing, comparable
Negative Logits
+       0.44
نی      0.42
INA     0.41
Toe     0.41
voda    0.40
Sov     0.39
Baba    0.39
ENDO    0.39
프      0.39
STEM    0.39
Positive Logits
comparable     0.51
an             0.49
类似于         0.47
similar        0.46
useful         0.45
Comparable     0.45
analogous      0.45
ণিজ্য          0.44
と同じ         0.44
campaigning    0.43
Activations Density: 0.000%