Explanations
occurrences of the word "the."
Negative Logits
Token      Logit
rzy        -0.17
zig        -0.16
ensem      -0.16
formats    -0.15
rych       -0.15
lea        -0.14
panse      -0.14
#          -0.14
.idea      -0.14
raison     -0.14
Positive Logits
Token         Logit
standpoint    0.34
perspective   0.32
outset        0.25
perspectives  0.23
beginning     0.22
Perspective   0.21
/to           0.21
oth           0.21
pers          0.19
depths        0.19
Activations Density: 0.079%