Explanations
phrases related to thoughts or mental processes
references to mental and emotional experiences
Negative Logits
Token       Logit
Firm        -0.65
Suff        -0.61
MOD         -0.60
Tycoon      -0.59
ISS         -0.59
Mub         -0.58
Split       -0.58
inctions    -0.56
Pix         -0.56
ammy        -0.56
Positive Logits
Token       Logit
steps       0.87
ails        0.72
selves      0.68
eks         0.66
stal        0.65
ancest      0.65
Nightmares  0.62
swing       0.61
doorstep    0.61
fireplace   0.61
Activations Density: 0.115%