INDEX
Explanations
words related to abstract concepts or qualities
phrases indicating a sense of belonging or community
Negative Logits
sites       -0.81
olicy       -0.79
iaries      -0.76
nces        -0.75
nets        -0.72
Tycoon      -0.70
ģĸ          -0.69
ials        -0.69
aughters    -0.68
ittees      -0.68
Positive Logits
urgency     0.94
warmth      0.91
humor       0.91
humour      0.87
intimacy    0.81
optimism    0.80
parity      0.76
bodily      0.75
realism     0.75
reckoning   0.75
Activations Density: 0.063%