Explanations
repetitive phrases or structures in the text
Negative Logits
Token    Logit
shit     -0.16
sss      -0.16
935      -0.16
ve       -0.15
tring    -0.15
971      -0.15
ss       -0.15
seul     -0.14
son      -0.14
leans    -0.14
Positive Logits
Token     Logit
curity    0.22
quence    0.21
責        0.17
责        0.17
cond      0.16
meisten   0.15
beiden    0.15
ahlen     0.14
sut       0.14
quential  0.14
Activations Density 0.119%
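
The density figure above is usually read as the share of token positions on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the toy data are all hypothetical and are not taken from the page or from any particular library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with a strictly
    positive activation. The dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 12 of which activate.
toy = [0.0] * 9988 + [1.5] * 12
density = activation_density(toy)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 0.119% would correspond to roughly 12 active positions per 10,000 tokens.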