INDEX
Explanations
possessive pronouns and phrases indicating ownership or belonging
Negative Logits
gree            -0.15
ooks            -0.15
ioneer          -0.15
indle           -0.14
eden            -0.14
anna            -0.14
enden           -0.14
mousemove       -0.14
Conway          -0.14
INTERRU         -0.14
Positive Logits
fault           0.27
Fault           0.25
undo            0.24
Fault           0.21
fault           0.21
contribution    0.19
Achilles        0.19
Contribution    0.19
responsibility  0.19
cue             0.17
Activations Density 0.109%