Explanations
words and phrases that indicate actions done to subjects or entities
Negative Logits
Token        Logit
gave         -0.62
silencio     -0.61
DockStyle    -0.61
did          -0.61
cavallo      -0.61
Stä          -0.60
rý           -0.59
Gegenteil    -0.59
nélk         -0.58
did          -0.58
Positive Logits
Token        Logit
flown        1.78
grown        1.60
spoken       1.60
blown        1.55
taken        1.51
risen        1.51
worn         1.47
fallen       1.45
taken        1.40
drawn        1.40
Activations Density: 0.132%