INDEX
Explanations
phrases related to specific names or titles
proper nouns or names
Negative Logits
Bean         -0.94
Schwe        -0.83
GF           -0.83
bean         -0.81
Progressive  -0.79
Chloe        -0.78
bould        -0.77
Powell       -0.77
Warren       -0.76
Albany       -0.76
Positive Logits
ir           2.04
IR           1.74
Ir           1.60
IR           1.56
iris         1.53
ir           1.46
Ir           1.46
ire          1.44
irs          1.39
iri          1.38
Activations Density: 0.271%