INDEX
Explanations
mentions of the name "Jane"
Negative Logits
Token        Logit
ctic         -0.91
*/(          -0.88
iated        -0.79
idated       -0.75
Disclaimer   -0.73
iating       -0.68
ulative      -0.65
rite         -0.65
panic        -0.65
MAT          -0.65
Positive Logits
Token        Logit
Doe          1.23
Mayer        0.92
Aust         0.92
Jacobs       0.87
etta         0.82
Jane         0.81
ane          0.81
Seymour      0.79
Ey           0.79
Esp          0.77
Activations Density 0.019%