INDEX
Explanations
references to specific names mentioned in the text
mentions of specific individuals and their affiliations or actions
Negative Logits
cas         -0.73
Malays      -0.72
ahi         -0.70
plates      -0.64
tered       -0.64
ļé          -0.63
teenth      -0.63
rav         -0.62
anguages    -0.59
thood       -0.59
Positive Logits
agan        0.80
Router      0.80
nor         0.80
ceived      0.73
INGTON      0.70
uther       0.69
actor       0.69
actions     0.68
ally        0.67
burgh       0.67
Activations Density: 0.086%