NEGATIVE LOGITS
anes       0.43
Math       0.42
Checking   0.41
rab        0.41
使用       0.40
Linda      0.40
čení       0.40
math       0.39
Susan      0.39
matemat    0.39
POSITIVE LOGITS
ambiguity      0.65
ambiguous      0.63
disamb         0.63
disamb         0.55
ambiguities    0.55
ambigu         0.52
暧             0.45
unambiguous    0.44
unambiguously  0.44
अंब            0.43
Activations Density: 0.001%