Explanations
phrases involving subtle hints or traces of something
phrases indicating subtle or implied qualities
Negative Logits
Merit      -0.88
Reviewer   -0.78
imens      -0.76
erto       -0.75
largest    -0.74
rer        -0.73
WATCHED    -0.73
riers      -0.72
ļéĨĴ       -0.71
>>\        -0.71
Positive Logits
humor       1.07
humour      1.01
irony       0.98
sweetness   0.96
cynicism    0.95
realism     0.93
brilliance  0.93
sanity      0.91
bitterness  0.90
sadness     0.90
Activations Density 0.205%