Explanations
specific verbs related to decisions or actions
questions and expressions of doubt or uncertainty
Negative Logits
Token        Logit
kaya         -0.76
zbollah      -0.65
yourselves   -0.65
'/           -0.61
izo          -0.58
Variant      -0.57
attRot       -0.56
Their        -0.55
uala         -0.51
uko          -0.50
Positive Logits
Token        Logit
he           2.20
his          1.79
He           1.70
his          1.65
His          1.54
He           1.47
him          1.42
he           1.39
His          1.25
HIS          1.23
Activations Density 1.975%