INDEX
Explanations
specific names of places or people
proper nouns, particularly names of people and places
Negative Logits
onz       -0.93
sburgh    -0.86
arcer     -0.85
onder     -0.78
olkien    -0.75
idious    -0.75
ailand    -0.74
ocaust    -0.73
okemon    -0.72
reek      -0.72
Positive Logits
Song      0.80
Swan      0.79
Ding      0.79
apple     0.79
Basin     0.77
Lake      0.76
phase     0.76
PsyNet    0.76
Stab      0.75
Lake      0.75
Activations Density: 0.042%