INDEX
Explanations
proper names
mentions of the name "Jane"
Negative Logits
Token        Logit
orescent     -0.83
fecture      -0.82
doors        -0.81
ctic         -0.80
rament       -0.76
rophic       -0.73
rophe        -0.72
tracking     -0.72
ursed        -0.72
ulative      -0.71
Positive Logits
Token            Logit
Doe              1.15
Jane             1.11
Jane             0.96
Seymour          0.80
Aust             0.78
ju               0.76
Foster           0.75
Jacobs           0.75
Approximately    0.74
Leilan           0.74
Activations Density: 0.005%