Explanations
verbs or phrases related to personal beliefs and actions
negations or expressions of doubt and uncertainty
Negative Logits
Token       Logit
ð           -0.73
SPONSORED   -0.72
ire         -0.63
bush        -0.62
ê           -0.62
tains       -0.61
donald      -0.61
alist       -0.61
"[          -0.61
Hur         -0.60
Positive Logits
Token        Logit
however      0.94
ital         0.76
therefore    0.75
moreover     0.74
meanwhile    0.69
furthermore  0.69
though       0.66
certainly    0.64
defin        0.63
ĸļ           0.62
Activations Density: 1.043%