Explanations
phrases indicating personal opinions or preferences
emphasized statements and expressions of personal opinion
Negative Logits
Token         Logit
»Ĵ            -0.89
WIND          -0.73
interstitial  -0.73
æĸ            -0.69
IRT           -0.69
çĭ            -0.68
ĸļ            -0.68
ANC           -0.66
ļéĨĴ          -0.66
ãĥĺ           -0.66
Positive Logits
Token         Logit
but           1.30
But           1.30
but           1.29
But           1.09
BUT           1.09
However       1.07
though        0.96
However       0.93
although      0.92
Nonetheless   0.89
Activations Density 0.757%