INDEX
Explanations
easy to understand or implement
Negative Logits
Z    0.82
सी    0.78
М    0.77
Մ    0.77
'    0.76
ির    0.74
Μ    0.73
Ι    0.73
U    0.71
’    0.70
Positive Logits
-    0.84
易    0.71
dàng    0.69
fácil    0.69
hitta    0.63
fáciles    0.62
breezy    0.62
롭게    0.61
hopping    0.59
revolt    0.58
Activations Density: 0.098%