INDEX
Explanations
rankings or numerical ratings of entities
ranking or numbers
Negative Logits
ยนต์        -0.38
and         -0.33
,           -0.32
murni       -0.32
wnież       -0.28
ฟัง         -0.28
and         -0.28
einf        -0.28
quizás      -0.27
combined    -0.26
Positive Logits
ConstraintMaker    1.02
Nummer             0.82
№                  0.81
houſe              0.80
NUMBER             0.79
№                  0.76
queſta             0.75
nummer             0.73
number             0.73
RANK               0.73
Activations Density 0.015%