INDEX
Explanations
references to personality traits or characteristics
Negative Logits
zoude              -0.84
desmotivaciones    -0.73
dezelve            -0.70
zelve              -0.69
templado           -0.67
navideño           -0.65
húmedo             -0.64
mijne              -0.61
umumkan            -0.60
berdayakan         -0.60
Positive Logits
pack               0.63
packs              0.59
beta               0.57
beta               0.56
Beta               0.54
Packs              0.54
Loop               0.53
tour               0.51
Pack               0.49
personalise        0.49
Activations Density: 0.473%