INDEX
Explanations
proper nouns related to a specific location or person
references to a specific name or entity
Negative Logits
dit              -0.82
ship             -0.70
tainment         -0.67
tions            -0.66
Appropriations   -0.66
ATOR             -0.66
yrinth           -0.65
McCann           -0.65
sburgh           -0.63
LECT             -0.62
Positive Logits
Sha              1.26
ikh              0.89
Tsu              0.87
olin             0.87
isha             0.86
osh              0.85
ivas             0.85
ulk              0.84
Yao              0.79
jah              0.77
Activations Density: 0.007%