INDEX
Explanations
references to personal relationships and interpersonal conflicts
Negative Logits
undertaken  -0.15
depart      -0.15
borne       -0.14
RTOS        -0.14
undergone   -0.14
urchase     -0.14
arisen      -0.14
begun       -0.14
weary       -0.14
thrill      -0.14
Positive Logits
fucked      0.25
fucking     0.21
bitch       0.20
shit        0.20
screwed     0.20
fart        0.19
fucks       0.19
pissed      0.19
quit        0.18
just        0.17
Activations Density: 0.397%