INDEX
Explanations
references to the character "Frodo" or variations of the name
proper nouns, particularly names of characters and people
Negative Logits
×ŀ                  -0.83
×                   -0.78
dress               -0.75
×Ļ                  -0.72
loo                 -0.72
ãĤ¨ãĥ«              -0.70
ãĤ´ãĥ³              -0.70
************        -0.70
creen               -0.68
****************    -0.68
Positive Logits
unin                0.98
ansson              0.87
atche               0.86
adh                 0.83
rites               0.82
auld                0.80
ringe               0.77
aldi                0.74
licks               0.74
adows               0.72
Activations Density: 0.018%