INDEX
Explanations
names of people or entities preceded by a title or username
instances of proper nouns, often related to names or identities
Negative Logits
Token           Logit
Reynolds        -0.79
avia            -0.78
SK              -0.75
319             -0.74
diesel          -0.73
EV              -0.73
DK              -0.70
OV              -0.70
Interstellar    -0.70
viruses         -0.69
Positive Logits
Token           Logit
Bar             2.64
Bar             2.42
bar             2.30
bar             2.21
Bars            2.05
BAR             1.98
bars            1.97
bars            1.75
Barber          1.64
Barb            1.44
Activations Density 0.214%
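
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, and the dashboard's actual computation may differ (for example in the threshold used or in how tokens are sampled).

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. This is an illustrative definition, not one taken
    from the dashboard itself.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: 3 active positions out of 1000 sampled tokens.
sample = [0.0] * 997 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.214% would correspond to the feature being active on roughly 2 out of every 1000 sampled tokens.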