INDEX
Explanations
References to specific names or locations, potentially related to a news event.
References to the names "Morales" and "Morocco."
Negative Logits
pta        -0.87
hower      -0.83
rophe      -0.79
ctive      -0.73
iago       -0.72
ting       -0.71
eq         -0.70
opia       -0.68
ysis       -0.68
ellation   -0.67
Positive Logits
rison      0.77
Morales    0.75
andom      0.74
livest     0.71
dyl        0.70
Mor        0.69
cow        0.68
Trails     0.68
occas      0.67
Reyes      0.64
Activations Density 0.012%