Explanations
Specific patterns of language and structure in the text, particularly occurrences of articles and conjunctions.
Negative Logits
alette     -0.14
illet      -0.14
#af        -0.14
afort      -0.14
HLT        -0.14
Meh        -0.13
imbus      -0.13
uarios     -0.13
consect    -0.13
_mC        -0.13
Positive Logits
eration    0.15
ounded     0.14
ler        0.14
iff        0.14
inue       0.14
ad         0.14
odge       0.13
oped       0.13
lessly     0.13
fty        0.13
Activations Density: 1.741%