INDEX
Explanations
matching and choosing best options
Negative Logits
peregr        0.84
wonder        0.74
tyranny       0.72
generation    0.71
abhavam       0.70
tyrannical    0.69
light         0.69
deprive       0.68
generate      0.68
anxiously     0.68
POSITIVE LOGITS
Matching      1.89
matching      1.77
matches       1.63
Matching      1.62
matching      1.53
matched       1.51
Match         1.44
match         1.41
Matches       1.38
Match         1.38
Activations Density 0.000%