INDEX
Explanations
phrases beginning with verbs in present tense
phrases that indicate assertion or claims
Negative Logits
soDeliveryDate    -0.71
hovah             -0.58
Photograph        -0.55
brance            -0.55
wearer            -0.55
ousands           -0.54
Exper             -0.54
pilgr             -0.53
immune            -0.53
civilisation      -0.53
Positive Logits
sarcast           1.00
rhet              0.81
apologizing       0.75
:]                0.74
emphatically      0.73
remarks           0.72
clar              0.68
bluntly           0.68
.):               0.67
apologized        0.67
Activations Density 0.824%