Explanations
proper names
the name "Alex."
Negative Logits
Token         Logit
fare          -0.71
FU            -0.69
SHIP          -0.67
purpose       -0.67
zee           -0.65
stakes        -0.65
SourceFile    -0.63
enegger       -0.63
eling         -0.61
manship       -0.61
Positive Logits
Token         Logit
andra         1.01
inia          0.98
opoulos       0.96
andre         0.95
iev           0.87
alon          0.87
azines        0.85
aic           0.83
ulia          0.83
apon          0.81
Activations Density: 0.013%
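
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce a value of the same size as the one reported here.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 13 of 100,000 token positions.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.013% corresponds to roughly 13 active positions per 100,000 tokens, which fits a feature that responds to a narrow class of inputs such as specific proper names.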