INDEX
Explanations
phrases related to direct speech and thoughts
repeated instances of the word "I" and variations that indicate uncertainty or negative assertions
Negative Logits
Palest       -0.76
anwhile      -0.71
Mous         -0.69
Zup          -0.63
Manhattan    -0.62
recogn       -0.61
Franch       -0.60
NEC          -0.59
çͰ           -0.59
sights       -0.58
Positive Logits
¡            1.05
Ķ            1.01
ľ            1.00
¤            1.00
Ń            1.00
ĺ            0.96
âĢķ          0.96
¬            0.95
¢            0.95
Ļ            0.95
Activations Density 0.297%