Explanations
phrases expressing experiences of disappointment or frustration
Negative Logits
Token         Logit
Solve         -0.22
eyim          -0.13
Trait         -0.13
ости          -0.12
ooth          -0.12
antu          -0.12
401           -0.11
ught          -0.11
MethodName    -0.11
auce          -0.11
Positive Logits
Token         Logit
kind          1.11
type          1.07
kinds         1.05
types         0.93
kind          0.85
sort          0.82
type          0.82
sorts         0.81
tipo          0.75
-type         0.75
Activations Density 0.474%
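
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample activations are all hypothetical and are not taken from the dashboard or its underlying data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation; the dashboard's exact definition is not given in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 5 of which activate the feature.
sample = [0.0] * 1000
for i in (10, 250, 400, 730, 999):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.500%"
```

Under this reading, a density of 0.474% would mean the feature is active on roughly 1 in 211 tokens.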