INDEX
Explanations
starting new sentences with phrases
Negative Logits
ach    0.59
onic    0.49
रोटी    0.49
itone    0.49
undo    0.47
ack    0.47
name    0.47
og    0.47
und    0.45
ne    0.45
Positive Logits
fourn    0.51
ہاکی    0.50
posiada    0.49
спорттук    0.49
shew    0.48
compétition    0.48
秶    0.47
baisse    0.47
⭐    0.46
budú    0.46
Activations Density 0.001%