INDEX
Explanations
phrases indicating a resolution or conclusion, with a focus on the concept of determination and inevitability
inquiries or phrases indicating questions about conditions or situations
Negative Logits
ãĥį      -0.83
Uriel    -0.76
ãĥĨ      -0.75
ãĥ¯      -0.75
ãĤ¿      -0.74
idel     -0.74
lyak     -0.73
Afee     -0.70
alk      -0.70
aeda     -0.69
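Several of the negative-logit tokens (for example `ãĥį`) look garbled but are probably raw vocabulary strings rather than encoding errors. The sketch below assumes the model uses a GPT-2 style byte-level BPE vocabulary, which the source does not state; under that assumption each character stands for one byte, and the strings can be mapped back to readable text. The helper names `bytes_to_unicode` and `decode_token` are illustrative.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a byte-level vocabulary string back to UTF-8 text.

    Raises KeyError if the string contains a character outside the table,
    which would suggest the vocabulary is not byte-level BPE after all.
    """
    raw = bytes(_CHAR_TO_BYTE[ch] for ch in token)
    # Partial multi-byte sequences are shown as U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥį", "ãĥĨ", "ãĥ¯", "ãĤ¿", "Uriel"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the four garbled entries decode to single katakana characters, and the plain ASCII tokens pass through unchanged.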
Positive Logits
whatsoever    0.84
slightest     0.74
jurisdiction  0.72
severity      0.70
depended      0.70
thickness     0.68
bothers       0.68
encount       0.68
anymore       0.67
challeng      0.67
Activations Density: 0.046%
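The density figure is usually read as the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading, assuming "fires" means an activation strictly above a threshold of zero; the function name `activation_density` and the sample data are hypothetical, and the dashboard's exact definition is not given in the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    # Count positions where the feature is active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Hypothetical sample: the feature fires on 1 of 10,000 token positions.
    sample = [0.0] * 9999 + [1.3]
    print(f"{activation_density(sample):.3%}")
```

On this reading, a density of 0.046% would mean the feature is active on roughly 46 of every 100,000 sampled tokens.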