INDEX
Explanations
descriptive words related to internal sensations or reactions, often negative in nature
references to strong emotional or visceral reactions
Negative Logits
hips        -0.89
eton        -0.74
Stephenson  -0.72
Aad         -0.70
Chains      -0.67
hare        -0.66
Izan        -0.65
hift        -0.64
Naz         -0.64
cale        -0.63
Positive Logits
ted         1.39
ierrez      1.28
ting        1.21
ters        1.21
tering      1.08
tered       1.05
warts       0.96
terson      0.94
microbiota  0.94
sy          0.92
Activations Density: 0.025%