Explanations
words ending in punctuation
Negative Logits
0.50
Comet    0.49
"    0.46
IDGE    0.45
white    0.45
،    0.44
Social    0.44
talent    0.44
talon    0.43
negation    0.42
Positive Logits
ලබා    0.50
牍    0.48
FI    0.48
रखेंगे    0.47
获得的    0.46
Supplementary    0.46
`],    0.45
ተጨማሪ    0.45
ools    0.44
canaux    0.44
Activations Density 0.003%