Explanations
phrases containing punctuation marks
quotes and statements that highlight numeric values or statistical information
Negative Logits
Token       Logit
boro        -0.77
purse       -0.70
compass     -0.68
lifes       -0.67
favor       -0.66
treasure    -0.64
retreat     -0.64
affili      -0.64
pursuit     -0.64
utical      -0.62
Positive Logits
Token          Logit
Instead        1.12
His            1.12
However        1.11
Nevertheless   1.06
Meanwhile      1.04
Asked          1.04
He             1.01
Regarding      1.00
Nonetheless    0.98
Speaking       0.97
Activations Density: 0.535%