Explanations
phrases involving the status or condition of individuals
the phrase "people who are" indicating discussions about individuals or groups
Negative Logits
Solution        -0.72
pedia           -0.69
Which           -0.68
Inventory       -0.67
ization         -0.66
Exit            -0.64
imation         -0.63
aneous          -0.62
cancellation    -0.62
Entry           -0.62
Positive Logits
accustomed      0.85
addicted        0.84
wolves          0.84
supposed        0.83
intimately      0.81
blinded         0.79
acquainted      0.77
interested      0.77
fluent          0.77
held            0.75
Activations Density: 0.138%