INDEX
Explanations
conversational phrases with an informal tone
expressions of personal experiences and sentiments
Negative Logits
etheless       -0.81
uitive         -0.77
cture          -0.76
prisingly      -0.73
surprisingly   -0.69
:=             -0.67
----------------------------------------------------------------  -0.66
minist         -0.66
inction        -0.65
largeDownload  -0.60
Positive Logits
.")   1.44
'."   1.28
").   1.24
.'"   1.23
'"    1.21
!'"   1.20
',"   1.10
"]    1.08
,'"   1.07
")    1.05
Activations Density: 0.322%