INDEX
Explanations
proper nouns, particularly names of places, people, and titles
topics related to sports and entertainment
Negative Logits
Salvador      -0.65
ULL           -0.59
Mell          -0.58
Bris          -0.58
Fla           -0.57
Cantor        -0.57
ocamp         -0.57
Lank          -0.56
Legislative   -0.56
Cull          -0.56
Positive Logits
existed       0.94
ain           0.93
sucks         0.92
shouldn       0.90
doesnt        0.90
exists        0.89
hadn          0.88
?).           0.87
wouldn        0.87
hasn          0.86
Activations Density: 0.852%