INDEX
Explanations
instances of the word "flip" and its variations
Negative Logits
Barth       -0.69
erdan       -0.66
Brandt      -0.63
tratti      -0.63
carottes    -0.62
MLLoader    -0.62
awtextra    -0.62
Cordero     -0.60
nubes       -0.60
ňov         -0.59
Positive Logits
flip        1.80
flips       1.73
flipping    1.57
flip        1.53
flipped     1.53
Flip        1.52
Flip        1.44
Fli         1.41
Fli         1.32
FLIP        1.21
Activations Density 0.007%