Explanations
significant references to personal experiences and emotions related to films or visual media
Negative Logits
Token     Logit
rak       -0.14
edin      -0.14
ÏĮμε      -0.14
ENAME     -0.14
سÙĪØ¨     -0.14
ENDOR     -0.14
anh       -0.14
oke       -0.14
ÙĪÛĮد     -0.13
orge      -0.13
Positive Logits
Token     Logit
besides   0.23
lies      0.20
lie       0.19
boil      0.18
boils     0.18
isto      0.18
lies      0.18
bes       0.17
apart     0.17
boiled    0.17
Activations Density 0.118%