INDEX
Explanations
names of people or characters
names and references to specific individuals or characters
Negative Logits
loo       -0.83
orthy     -0.81
nikov     -0.80
atile     -0.78
leader    -0.76
brance    -0.76
ndra      -0.75
ophers    -0.74
opers     -0.74
arity     -0.74
Positive Logits
rawl               0.77
[token missing]    0.77
[token missing]    0.76
Fey                0.73
Suns               0.70
astical            0.69
Mia                0.69
Downloadha         0.68
ointed             0.67
Dia                0.67
Activations Density: 0.028%