NEGATIVE LOGITS
thrift          0.47
vyb             0.44
esian           0.43
sensitized      0.43
渴              0.43
敛              0.41
thirsty         0.41
iciency         0.39
decayed         0.38
Loew            0.38
POSITIVE LOGITS
tokenizer       0.55
Tokenizer       0.48
Trainer         0.45
token           0.42
trainer         0.41
ArgumentParser  0.41
Trainer         0.41
热              0.41
trainer         0.40
newToken        0.40
Activations Density 0.004%