INDEX
Explanations
used to describe functionality
Negative Logits
5             0.66
4             0.63
Use           0.63
7             0.61
multipurpose  0.57
käyttö        0.56
използ        0.56
वापरा         0.55
overuse       0.55
usable        0.54
Positive Logits
ttt           0.55
ILIA          0.54
ت             0.53
ع             0.53
determining   0.53
தி            0.49
ет            0.49
ον            0.48
ܛ             0.48
ห             0.47
Activations Density: 0.050%