INDEX
Explanations
phrases related to self-awareness and reflection
instances of the character "Ļ" in the text
Negative Logits
Token            Logit
Mobil            -0.59
Skydragon        -0.59
Gaul             -0.57
Polk             -0.57
Azerb            -0.55
Kurdistan        -0.55
fragmentation    -0.54
Palestin         -0.54
Rhodes           -0.54
metic            -0.53
Positive Logits
Token            Logit
s                1.19
t                1.03
scl              0.94
ll               0.93
sure             0.93
sent             0.91
tis              0.90
ski              0.89
mean             0.87
sed              0.85
Activations Density 0.259%