INDEX
Explanations
old or traditional concepts
Negative Logits
enumerate    0.43
↵            0.41
bounded      0.40
으로         0.39
="           0.39
"""          0.39
Erika        0.38
\            0.38
`${          0.38
lunghezza    0.37
Positive Logits
old          1.18
fashioned    1.05
fashioned    1.04
旧           0.98
पुराने        0.98
old          0.95
vecchio      0.94
viejo        0.93
舊           0.93
OLD          0.93
Activations Density 0.026%