Explanations
instances of the word "Fro" at varying activation levels
references to the character Frodo and related terms, particularly in the context of "The Lord of the Rings."
Negative Logits
ually         -0.86
iaries        -0.84
itutional     -0.77
inity         -0.75
arian         -0.74
iary          -0.73
alyst         -0.72
ascus         -0.72
inian         -0.71
ities         -0.69
Positive Logits
xual          0.83
bie           0.77
lette         0.76
¶æ            0.75
EStreamFrame  0.75
ctic          0.73
perty         0.72
laus          0.72
ãģ¦           0.69
horses        0.68
Activations Density: 0.046%