INDEX
Explanations
proper nouns related to different entities or individuals
entities related to organizations, movements, or political affiliations
Negative Logits
untarily     -0.64
consent      -0.60
whisper      -0.59
onym         -0.59
eatured      -0.58
simulated    -0.58
ezvous       -0.57
alyses       -0.57
flaw         -0.55
uphem        -0.54
Positive Logits
sake         0.96
considering  0.91
é¾įåĸļ士     0.78
because      0.77
erning       0.75
especially   0.75
lovers       0.73
concerned    0.72
morale       0.72
purposes     0.72
Activations Density 0.438%