Explanations
phrases related to revising or negating previous statements or positions
Negative Logits
Token       Logit
entric      -0.86
risome      -0.70
uyomi       -0.68
viz         -0.61
inational   -0.60
ities       -0.60
orp         -0.60
uria        -0.60
terday      -0.59
æ©Ł         -0.59
Positive Logits
Token       Logit
track       1.23
stab        1.13
dated       1.07
packs       1.05
ped         1.01
loaded      0.98
tracking    0.97
side        0.97
strap       0.94
pack        0.93
Activations Density: 0.015%
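
A density figure like the one above is commonly read as the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample activations are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 20,000 sampled tokens.
sample = [0.0] * 19997 + [0.4, 1.2, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.015% means the feature is active on roughly 1.5 of every 10,000 sampled tokens.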