INDEX
Explanations
phrases related to questioning or challenging actions or decisions
Negative Logits
Token         Logit
flow          -0.75
wildlife      -0.74
daily         -0.73
clin          -0.73
retreat       -0.70
millenn       -0.70
intensive     -0.68
lodge         -0.68
Mahjong       -0.67
outpatient    -0.67
Positive Logits
Token           Logit
although        1.82
especially      1.80
albeit          1.70
except          1.70
though          1.69
including       1.69
unless          1.68
which           1.66
particularly    1.64
perhaps         1.62
Activations Density: 0.141%