INDEX
Explanations
phrases related to specific situations or conditions
Negative Logits
Solve         -0.20
eyim          -0.14
Trait         -0.13
istrovství    -0.13
ooth          -0.12
.observe      -0.12
auce          -0.12
ости          -0.12
шев           -0.12
ught          -0.12
POSITIVE LOGITS
type          1.08
kind          1.08
kinds         1.02
types         0.92
kind          0.84
type          0.82
sort          0.80
sorts         0.78
tipo          0.77
-type         0.74
Activations Density 0.463%