Explanations
phrases that express opinions or assessments about movies
Negative Logits
ublik     -0.15
¼åIJĪ      -0.14
agus      -0.14
ư         -0.14
odyn      -0.14
kir       -0.13
ropp      -0.13
جز        -0.13
جز        -0.13
å¡ļ       -0.13
Positive Logits
cf        0.15
дека      0.14
anki      0.14
enor      0.14
aurus     0.13
DEM       0.13
igm       0.13
Coins     0.13
iel       0.13
volatile  0.13
Activations Density: 0.092%