Explanations
instances of the word "the"
Negative Logits
Tatsache   -0.61
fact       -0.49
fact       -0.48
façon      -0.48
manière    -0.48
feit       -0.47
manera     -0.47
maneira    -0.46
meest      -0.44
sätt       -0.42
Positive Logits
doors            1.06
curtain          0.92
IntoConstraints  0.91
wheels           0.90
tide             0.89
clocks           0.89
lights           0.88
curtains         0.88
unthinkable      0.88
pendulum         0.86
Activations Density 0.450%