Explanations
places and countries, with a focus on Norway
Negative Logits
ept        -0.79
ual        -0.76
place      -0.74
icago      -0.73
ually      -0.71
plain      -0.70
uring      -0.69
heet       -0.68
ulating    -0.68
###        -0.67
Positive Logits
Bok          0.88
wegian       0.81
Norway       0.77
Norwegian    0.76
Oslo         0.72
Refugee      0.70
Andersen     0.69
Skydragon    0.68
Yard         0.66
Dag          0.66
Activations Density: 7.271%