Explanations
phrases involving comparing different things
instances of the word "with" indicating comparisons or relationships
Negative Logits
alone       -0.68
zin         -0.66
icious      -0.65
press       -0.65
sylvania    -0.65
entimes     -0.65
hops        -0.63
fighter     -0.63
BUG         -0.63
chief       -0.62
Positive Logits
regards     1.14
regard      1.09
stood       0.97
drawn       0.94
impunity    0.87
draw        0.86
respect     0.86
holding     0.71
holders     0.68
trl         0.66
Activations Density: 0.132%