INDEX
Explanations
names of people and places
attributes related to individuals and their affiliations or identities
Negative Logits
checks          -0.76
reviewers       -0.64
selection       -0.63
enton           -0.62
erence          -0.62
indexes         -0.62
supplemental    -0.61
descriptor      -0.61
liner           -0.60
coaster         -0.60
Positive Logits
Äĩ              1.06
Downloadha      0.99
pora            0.92
rahim           0.89
oÄŁ             0.87
ÄŁ              0.86
sbm             0.86
qqa             0.86
aram            0.85
Sheikh          0.84
Activations Density 0.548%
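
Some of the positive-logit tokens above (Äĩ, oÄŁ, ÄŁ) look like byte-level BPE pieces and not readable text. The sketch below shows one way to turn such token strings back into readable characters. It assumes the tokenizer uses the GPT-2 style byte-to-unicode table, which the page itself does not state. The helper names bytes_to_unicode and decode_token are illustrative and not taken from the page.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table that maps each byte (0-255) to a printable character.

    ASSUMPTION: the model behind this page uses GPT-2 style byte-level BPE.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token string back to raw bytes and decode them as UTF-8."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Incomplete byte sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["Äĩ", "oÄŁ", "ÄŁ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, these three tokens decode to ć, oğ and ğ, letters that are common in Slavic and Turkish names, which fits the explanations listed at the top of the page.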