Explanations
phrases that indicate classifications or types with a focus on personal experiences
Negative Logits
kılı       -0.63
xlabel     -0.60
poitrine   -0.59
pleaſure   -0.59
unut       -0.59
vertes     -0.59
cauſe      -0.57
nicio      -0.57
noires     -0.56
iload      -0.55
Positive Logits
Kinda            1.31
sorta            1.26
Kinda            1.24
kinda            1.14
kinda            1.09
Somewhat         1.07
complexContent   1.06
propOrder        0.98
Somewhat         0.95
somewhat         0.93
Activations Density: 0.098%