INDEX
Explanations
terms related to potential choices or substitutes
mentions of alternatives in various contexts
Negative Logits
awar                -0.88
ric                 -0.80
gran                -0.72
Saud                -0.71
haw                 -0.71
Que                 -0.70
cer                 -0.70
bra                 -0.70
Charge              -0.69
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.68
Positive Logits
alternatives        1.51
alternative         1.16
ensical             1.01
options             0.95
龍契士              0.95
atives              0.93
replacements        0.90
Altern              0.88
solutions           0.88
itutes              0.87
Activations Density: 0.005%