INDEX
Explanations
adjectives that describe characteristics or qualities
Negative Logits
Delia        -0.86
Paglinawan   -0.86
fecture      -0.84
crows        -0.83
virgins      -0.83
virginity    -0.81
imputation   -0.81
manteau      -0.79
ausea        -0.79
Slf          -0.79
Positive Logits
idal         0.67
cing         0.59
led          0.56
}')          0.56
zyła         0.56
pá           0.56
ided         0.56
cer          0.52
ced          0.52
InstanceOf   0.52
Activations Density 0.040%