INDEX
Explanations
references to personal connections or ownership
Negative Logits
orum        -0.16
fad         -0.14
Surround    -0.14
orde        -0.14
inh         -0.14
inx         -0.14
uilt        -0.14
Zi          -0.14
aren        -0.14
IndexOf     -0.13
Positive Logits
presence    0.22
irut        0.18
existence   0.17
presence    0.16
Presence    0.15
voice       0.15
arrival     0.15
reign       0.15
exist       0.14
-kind       0.14
Activations Density 0.307%