Explanations
references to specific individuals, particularly those named Megan or Randy
Megan, Randy, Gwen
Negative Logits
Token             Logit
'').              -0.47
"");              -0.45
||}               -0.44
fiore             -0.41
//////////////    -0.41
fl                -0.39
Fru               -0.39
'..               -0.38
''.               -0.38
tableFuture       -0.37
Positive Logits
Token             Logit
Megan             2.14
Megan             2.11
Meghan            0.96
Meghan            0.89
ainfi             0.83
Meg               0.80
Meg               0.79
megane            0.79
mauva             0.77
enfans            0.75
Activations Density 0.001%