INDEX
Explanations
punctuation followed by explanations
Negative Logits
stress        0.90
hesitate      0.84
frustration   0.82
doubt         0.78
rejoice       0.78
hubby         0.76
stressors     0.75
болезнь       0.75
estrés        0.75
াকাল          0.75
Positive Logits
Which         1.40
And           1.36
Including     1.30
That          1.17
Which         1.16
Namely        1.12
Specifically  1.12
And           1.12
Because       1.07
Not           1.05
Activations Density: 0.147%