Explanations
phrases emphasizing the word "that" in various contexts
Negative Logits
Token       Logit
NameInMap   -0.91
pleaſure    -0.84
viſ         -0.79
eſt         -0.75
myſelf      -0.74
ſever       -0.74
Conſ        -0.71
Majefty     -0.71
VIC         -0.70
Reſ         -0.70
Positive Logits
Token    Logit
that     2.83
that     2.21
bahwa    1.86
That     1.68
bahawa   1.64
THAT     1.62
That     1.61
THAT     1.56
ότι      1.44
że       1.36
Activations Density 0.545%