Explanations
phrases related to movie reviews and ratings
Negative Logits
inkel            -0.16
opleft           -0.15
Playground       -0.15
otton            -0.15
Gi               -0.14
ätt              -0.14
LoggerFactory    -0.14
rellas           -0.14
ÙİØª             -0.14
iples            -0.14
Positive Logits
_RENDERER        0.15
361              0.15
thon             0.15
-render          0.15
currently        0.15
gré              0.14
currently        0.14
ÙĪØ´             0.14
ะ                0.14
>Show            0.14
Activations Density 0.130%