INDEX
Explanations
language understanding capabilities
Negative Logits
ಡುವುದ  0.41
נ  0.40
אחד  0.40
ܘܢ  0.39
fermeture  0.38
蹌  0.38
veggies  0.37
debilit  0.37
pport  0.37
einigen  0.37
Positive Logits
Analysis  0.46
Handbook  0.44
and  0.42
и  0.41
beliefs  0.40
罫  0.40
routinely  0.40
www  0.39
Computation  0.39
language  0.39
Activations Density 0.000%