INDEX
Explanations
phrases related to personal experience and identification
Negative Logits
s                      -1.02
Witt                   -0.81
(token text missing)   -0.78
Assisi                 -0.76
meng                   -0.75
Omen                   -0.72
Bub                    -0.71
οποία                  -0.71
ses                    -0.71
ns                     -0.71
Positive Logits
صوتيه        0.92
στη          0.88
waukee       0.85
Bue          0.82
Buxton       0.81
detainees    0.81
Parke        0.80
aDecoder     0.80
bilang       0.80
detenidos    0.79
Activations Density: 0.058%