INDEX
Explanations
titles or mentions of authors
the word "author" and its variations
Negative Logits
ll            -0.75
Libre         -0.71
tone          -0.67
EMS           -0.67
Nicaragua     -0.66
fw            -0.65
Eastern       -0.63
Territories   -0.63
Bots          -0.62
ijn           -0.62
Positive Logits
itatively     1.61
itative       1.16
itar          1.13
itarian       1.10
essee         0.95
itent         0.94
itism         0.87
izations      0.87
ournals       0.86
isations      0.85
Activations Density 0.031%