INDEX
Explanations
instances of the word "this" and related pronouns in various contexts
Negative Logits
ever        -0.14
æ®Ĭ         -0.14
oubted      -0.14
iT          -0.14
iler        -0.14
iesz        -0.14
lies        -0.13
ync         -0.13
tersebut    -0.13
caf         -0.13
Positive Logits
ones        0.26
morning     0.20
timeofday   0.19
reminded    0.18
whole       0.18
'll         0.17
-IS         0.17
afternoon   0.17
certainly   0.16
guy         0.16
Activations Density: 0.212%