INDEX
Explanations
phrases expressing equality, fairness, or applicability regardless of a specific condition
phrases emphasizing unconditionality and inclusivity
Negative Logits
itia          -0.82
oshenko       -0.75
²             -0.71
ãĤ¼ãĤ¦ãĤ¹     -0.71
efficients    -0.70
©¶æ¥µ         -0.70
jan           -0.70
atis          -0.69
raq           -0.69
ldon          -0.68
Positive Logits
whether         1.07
circumstance    0.90
affiliation     0.88
whatsoever      0.86
geography       0.86
differing       0.85
nationality     0.85
severity        0.80
circumstances   0.80
specifics       0.80
Activations Density: 0.048%