INDEX
Explanations
words related to actions of suppression or restriction
words or phrases related to questioning or uncertainty
Negative Logits
Prosper       -0.66
strap         -0.65
hist          -0.65
permitting    -0.64
stun          -0.62
breeze        -0.61
disappoint    -0.61
roy           -0.61
overhead      -0.60
TODAY         -0.60
Positive Logits
ibble         1.32
irk           1.30
arre          1.30
ivering       1.27
iets          1.18
arant         1.17
orum          1.17
itter         1.17
ilt           1.15
ilts          1.12
Activations Density 0.015%