INDEX
Explanations
references to specific countries
references to countries and companies
Negative Logits
ories       -0.74
ests        -0.66
inations    -0.66
Kid         -0.66
ernels      -0.64
bolts       -0.64
Machines    -0.63
subtitles   -0.63
batteries   -0.62
inez        -0.62
Positive Logits
Called        0.93
whose         0.78
wide          0.77
LIKE          0.77
onym          0.75
resembling    0.72
Named         0.72
specializing  0.71
starved       0.71
atical        0.69
Activations Density 0.332%