INDEX
Explanations
specific names, likely related to characters or locations
proper nouns related to characters and places, particularly in a narrative context
Negative Logits
tering        -0.83
Cumber        -0.75
zman          -0.72
eers          -0.68
Manip         -0.67
enburg        -0.67
Wonderland    -0.64
zel           -0.64
inval         -0.64
Box           -0.64
Positive Logits
phia          0.95
ourse         0.81
prise         0.75
icago         0.72
prises        0.71
ĺħ            0.70
roots         0.69
WP            0.69
Trace         0.68
dain          0.68
Activations Density 0.065%
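
To make the density figure concrete, here is a minimal sketch. It assumes that "activation density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name, the threshold parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 13 active tokens out of 20,000 gives 0.065%.
sample = [1.0] * 13 + [0.0] * 19987
print(f"{activation_density(sample):.3%}")
```

Under that assumption, a density of 0.065% corresponds to the feature firing on roughly 6 or 7 tokens out of every 10,000.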