INDEX
Explanations
instances of the word "its" in various contexts
Negative Logits
Experience    -0.16
312           -0.15
things        -0.15
eg            -0.14
guy           -0.14
Choice        -0.14
experience    -0.13
thing         -0.13
Same          -0.13
ans           -0.13
Positive Logits
’             0.34
'             0.31
lef           0.29
contents      0.28
contents      0.25
entirety      0.24
existence     0.22
inception     0.21
creator       0.21
elves         0.21
Activations Density: 0.271%