Explanations
words related to proper nouns or names of individuals and places
Negative Logits
Token         Logit
Masquerade    -0.87
Sabb          -0.87
Tec           -0.85
Mub           -0.84
Stab          -0.78
Curiosity     -0.77
Pes           -0.75
Schwe         -0.75
Schultz       -0.72
Chong         -0.72
Positive Logits
Token         Logit
or            1.45
OR            1.40
oris          1.30
ors           1.27
orians        1.22
oros          1.22
orian         1.19
ori           1.18
oring         1.18
oria          1.17
Activations Density 0.195%
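
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming "activations density" means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all illustrative assumptions, not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 2000 token positions, 4 of them active.
toy = [0.0] * 1996 + [0.7, 1.2, 0.3, 2.1]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.195% would mean the feature is active on roughly 2 out of every 1,000 tokens.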