Explanations
references to literary and cultural figures
Negative Logits
Token       Logit
ruba        -0.16
سÙĨت        -0.16
isin        -0.15
388         -0.15
izzo        -0.15
motel       -0.14
uzzi        -0.14
UTE         -0.14
esign       -0.14
345         -0.14
Positive Logits
Token       Logit
Hogwarts    0.25
Dumbledore  0.22
Voldemort   0.21
Harry       0.21
Harry       0.19
Rowling     0.19
Potter      0.19
Snape       0.18
HP          0.17
foy         0.17
Activations Density: 0.022%