INDEX
Explanations
phrases where someone is quoted as saying something
Negative Logits
inary     -0.80
spir      -0.71
rats      -0.71
eneg      -0.69
acts      -0.69
empt      -0.68
aired     -0.67
rel       -0.65
icates    -0.65
illus     -0.64
Positive Logits
Jonathan  0.81
David     0.79
Dawn      0.79
ï¸ı       0.79
Pamela    0.78
Laura     0.78
Joyce     0.78
Quartz    0.77
Polly     0.76
Joel      0.76
Activations Density: 0.034%