INDEX
Explanations
proper nouns, particularly names of people and places
formulations of the verb "to be" in various tenses
Negative Logits
izable         -0.82
Current        -0.65
maturity       -0.63
ological       -0.62
definitions    -0.62
entails        -0.61
validity       -0.60
merits         -0.60
newsletters    -0.60
Unless         -0.60
Positive Logits
able           1.14
unable         1.08
hes            1.08
fined          0.96
fortunate      0.96
unaware        0.94
supposed       0.92
tasked         0.92
criticized     0.92
wolves         0.91
Activations Density: 0.308%