INDEX
Explanations
proper nouns related to political figures and government organizations
instances of statements or quotes made by individuals or authorities
Negative Logits
otin      -0.90
ILCS      -0.86
à¦        -0.86
=~=~      -0.80
RH        -0.75
COR       -0.69
VIDEO     -0.68
physical  -0.68
Exit      -0.67
ï¸        -0.66
Positive Logits
doms          0.95
bluntly       0.88
goodbye       0.81
sarcast       0.79
afterward     0.75
afterwards    0.72
anecd         0.72
emphatically  0.71
aloud         0.68
repeatedly    0.66
Activations Density 0.279%