Explanations
phrases that convey subjective evaluations or characteristics of subjects
Negative Logits
estekak         -0.57
Judea           -0.55
ientôt          -0.50
Fußballspieler  -0.49
تقاوى           -0.49
IsMutable       -0.48
tvguidetime     -0.47
$_"             -0.47
/*              -0.46
تضيفلها         -0.46
Positive Logits
WHICH    0.54
which    0.49
Which    0.49
vilket   0.46
which    0.44
Which    0.43
pretty   0.41
což      0.38
vilken   0.38
hvilket  0.38
Activations Density 0.463%