INDEX
Explanations
mentions of political and governmental entities or activities
references to various political and societal entities and their actions
Negative Logits
ãĥ¥          -0.71
aughter      -0.66
anya         -0.62
REE          -0.56
enture       -0.56
[];          -0.56
ety          -0.55
atching      -0.55
rating       -0.55
ocalypse     -0.55
Positive Logits
alas                  0.88
albeit                0.84
meanwhile             0.81
namely                0.78
ItemThumbnailImage    0.76
unsurprisingly        0.74
however               0.73
huh                   0.73
sensing               0.73
aka                   0.72
Activations Density: 0.345%