INDEX
Explanations
repeated phrases or patterns in the text
Negative Logits
pim            -0.64
pound          -0.60
envy           -0.58
arming         -0.57
corridors      -0.57
latch          -0.56
Spoiler        -0.56
herald         -0.55
ribbon         -0.54
establishing   -0.54
Positive Logits
enei           0.99
oslav          0.91
pillar         0.88
haar           0.82
iq             0.82
heid           0.79
hiro           0.79
heng           0.79
arel           0.79
bah            0.77
Activations Density: 3.258%