INDEX
Explanations
the names of individuals in official or leadership positions
the word "the," indicating a focus on definite articles in the text
Negative Logits
mania         -0.70
ubi           -0.66
anye          -0.65
worn          -0.65
masturb       -0.64
goers         -0.62
rehears       -0.61
aji           -0.61
disrespect    -0.61
nostalg       -0.61
Positive Logits
aforementioned    0.99
largest           0.91
same              0.90
highest           0.88
National          0.88
smallest          0.86
International     0.86
latter            0.85
United            0.84
Americas          0.83
Activations Density: 0.388%
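
The page does not define the density figure. One common reading is that it is the fraction of sampled token positions on which the feature activates at all. The sketch below illustrates that reading only: the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 token positions, 4 of them with nonzero activation.
sample = [0.0] * 996 + [1.2, 0.3, 2.5, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure on the page
```

Under this reading, a density of 0.388% would correspond to the feature firing on roughly 388 of every 100,000 sampled tokens.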