Explanations
references to specific individuals and organizations, particularly within a political or societal context
Negative Logits
Token     Logit
ej        -0.19
incinn    -0.18
loan      -0.15
aggio     -0.15
loe       -0.14
am        -0.14
ÅĻeh      -0.14
lier      -0.14
liers     -0.14
igham     -0.14
Positive Logits
Token     Logit
uxtap     0.22
stice     0.19
igsaw     0.19
venile    0.17
ÅĽli      0.17
oints     0.16
ST        0.15
ET        0.15
affe      0.15
uggling   0.15
Activations Density 0.838%