INDEX
Explanations
phrases related to legal actions or government involvement
NEGATIVE LOGITS
uality         -0.96
empl           -0.77
ppa            -0.75
redits         -0.75
heimer         -0.72
NN             -0.69
chrome         -0.69
largeDownload  -0.68
bia            -0.68
iza            -0.68
POSITIVE LOGITS
virtue         1.21
products       0.88
laws           0.87
passers        0.85
STATS          0.83
extremists     0.81
locals         0.80
outsiders      0.80
professionals  0.77
successive     0.77
Activations Density 0.689%