INDEX
Explanations
the exact word "word" occurring in the text
multiple instances of the term "word" in various contexts
Negative Logits
psey         -0.95
etheus       -0.74
opausal      -0.73
orsche       -0.73
âĹ¼          -0.70
ithing       -0.70
aples        -0.70
roxy         -0.69
ruciating    -0.69
ervation     -0.67
Positive Logits
word         1.03
sworth       0.97
press        0.97
naire        0.89
Word         0.88
Word         0.87
word         0.75
lore         0.74
mith         0.73
aloud        0.73
Activations Density: 0.012%