INDEX
Explanations
repeated instances of the word "that" and related conjunctions
Negative Logits
and      -0.18
?        -0.17
1        -0.17
and      -0.17
3        -0.17
,        -0.17
inand    -0.16
(        -0.16
2        -0.16
=        -0.16
Positive Logits
ship           0.18
ลาด            0.17
information    0.17
trend          0.17
process        0.16
info           0.16
endeavor       0.16
decision       0.15
idea           0.15
entire         0.15
Activations Density 0.130%
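
As a rough illustration of how to read the density figure: the sketch below assumes that activation density means the fraction of sampled token positions on which the feature fires at all. That definition, the function name activation_density, and the sample data are assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled positions where the
    feature is active (activation > threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 13 of 10,000 token positions.
sample = [1.0] * 13 + [0.0] * 9987
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.130%
```

Under that reading, a density of 0.130% corresponds to roughly 13 active positions per 10,000 tokens.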