INDEX
Negative Logits
Sotheby       0.52
audiobook     0.51
Scottsdale    0.48
Finley        0.48
Trudeau       0.46
Walter        0.46
Wür           0.45
جوړونکي       0.44
Smithsonian   0.44
Rogers        0.44
POSITIVE LOGITS
trains        0.47
said          0.42
_             0.42
Trains        0.42
hadrons       0.40
asserts       0.39
ày            0.39
And           0.38
Than          0.38
valves        0.38
Activations Density 0.005%