Explanations
the word "seemingly" appearing in the text
the word "seemingly" and expressions indicating a perception versus reality distinction
Negative Logits
rike        -0.76
rikes       -0.73
imeter      -0.70
orters      -0.68
"}],"       -0.67
ppers       -0.67
llers       -0.67
Stud        -0.66
elsen       -0.66
iths        -0.64
Positive Logits
Buyable     0.95
innocuous   0.94
icably      0.87
©¶æ¥µ       0.80
metic       0.78
contrad     0.78
unint       0.78
çľ          0.77
discrep     0.77
\\\\\\\\    0.76
Activations Density 0.004%