INDEX
Explanations
phrases related to official stances or positions on various subjects
occurrences of the word "on," particularly in political or positional contexts
Negative Logits
ennes       -0.80
ults        -0.72
::::::::    -0.68
ENTS        -0.67
ï¸ı         -0.67
SourceFile  -0.67
MODE        -0.67
ARM         -0.65
BILITIES    -0.65
ACT         -0.65
Positive Logits
behalf      1.65
erous       1.02
yx          0.96
etheless    0.94
steroids    0.92
etime       0.89
Capitol     0.87
matters     0.86
shore       0.85
site        0.85
Activations Density: 0.233%