INDEX
Explanations
words related to global politics and policies
possessive forms indicating ownership or belonging
Negative Logits
tch        -0.86
thens      -0.80
android    -0.77
wright     -0.73
cloth      -0.71
ij士       -0.71
bay        -0.71
kan        -0.71
{:         -0.70
ollow      -0.70
Positive Logits
penchant       1.38
reputation     1.31
strengths      1.30
propensity     1.26
shortcomings   1.23
failings       1.18
reliance       1.16
woes           1.15
weaknesses     1.15
tendency       1.14
Activations Density: 0.244%