INDEX
Explanations
classified as inferior or objects
Negative Logits
琯              0.43
Quir            0.41
ప్రతి            0.40
disapproval     0.40
conventionally  0.39
overshadow      0.39
Richter         0.38
                0.37
丕              0.37
传统的          0.37
Positive Logits
inferior        0.93
worthless       0.88
inferiores      0.88
inférieurs      0.83
verm            0.82
inférieure      0.80
infer           0.79
inferiore       0.77
inférieures     0.77
inférieur       0.75
Activations Density 0.025%