INDEX
Explanations
specific names and phrases related to politics or current affairs
mentions of specific individuals, groups, or entities, particularly in a sociopolitical context
Negative Logits
dule          -0.66
otonin        -0.65
umbn          -0.65
iculty        -0.65
âĸ¬âĸ¬        -0.61
igm           -0.61
():           -0.59
iatus         -0.58
âĿ            -0.58
verages       -0.57
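Two of the token strings in this list (âĸ¬âĸ¬ and âĿ) look like raw byte-level BPE vocabulary entries rather than readable text. The page does not name its tokenizer, so the following is only a sketch under the assumption that tokens are displayed in the GPT-2-style byte-to-unicode alphabet. The helper names (bytes_to_unicode, decode_token) are illustrative, not part of the page.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE.

    Assumption: the dashboard shows tokens in this alphabet.
    """
    # Bytes that are already printable map to themselves.
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    byte_values = printable[:]
    code_points = printable[:]
    n = 0
    # Every other byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Reverse table: displayed character -> original byte.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed vocabulary string back into readable text.

    Raises KeyError if the string contains a character outside the
    byte-level alphabet (for example a literal space).
    Incomplete UTF-8 sequences are shown as U+FFFD.
    """
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("\u00e2\u0138\u00ac\u00e2\u0138\u00ac"))  # prints ▬▬
    print(repr(decode_token("\u00e2\u013f")))  # partial UTF-8 sequence
    print(decode_token("iculty"))  # ASCII tokens are unchanged
```

Under that assumption, âĸ¬âĸ¬ corresponds to two U+25AC characters, while âĿ is only the first two bytes of a three-byte UTF-8 character, so it cannot be rendered on its own.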
Positive Logits
notwithstanding   1.60
preferring        0.97
included          0.95
said              0.89
permitting        0.84
being             0.83
argues            0.79
says              0.76
assures           0.75
explains          0.72
Activations Density 0.633%
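
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The function name and threshold parameter below are hypothetical and only illustrate that reading.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires at all; the source does not confirm this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Toy data: the feature fires on 2 of 8 tokens.
    sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
    print(f"{activation_density(sample):.3%}")
```

Read this way, a density of 0.633% would mean the feature is active on roughly 6 of every 1,000 sampled tokens.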