Explanations
sentences starting with "This" followed by specific details or actions
statements expressing personal opinions or views
Negative Logits
Token            Logit
Solution         -0.76
)].              -0.68
Solution         -0.61
DISTRICT         -0.58
Implementation   -0.58
multiplier       -0.57
BER              -0.55
stadt            -0.55
Answer           -0.55
GOODMAN          -0.55
Positive Logits
Token            Logit
biography         1.06
His               1.05
his               1.04
his               0.89
Born              0.87
accol             0.86
endorsements      0.82
tattoos           0.81
autobi            0.81
stint             0.80
Activations Density 0.939%