INDEX
Explanations
names of people or characters within a given context
words related to names and personal identifiers
Negative Logits
DPRK          -0.66
stockpile     -0.65
daylight      -0.65
SPONSORED     -0.64
______        -0.63
welfare       -0.63
CTRL          -0.59
scarcity      -0.59
token         -0.58
chronically   -0.58
Positive Logits
akis          1.51
iere          1.45
otti          1.44
idis          1.43
neau          1.43
ini           1.43
opoulos       1.42
atos          1.42
ova           1.41
oux           1.39
Activations Density: 0.285%