INDEX
Explanations
phrases associated with perceptions or evaluations of experiences or entities
Negative Logits
varones      -0.55
ientôt       -0.53
harusnya     -0.52
cuentos      -0.52
aikaa        -0.49
bileklik     -0.49
mulighed     -0.49
tutkim       -0.48
tidaknya     -0.47
Identyfik    -0.47
Positive Logits
Ah            0.71
impression    0.70
Aw            0.65
impression    0.65
Ah            0.64
Impression    0.59
Impression    0.56
Aw            0.56
away          0.55
Away          0.54
Activations Density: 0.222%