INDEX
Explanations
specific references to dictionaries
references to dictionaries and encyclopedias
Negative Logits
Token        Logit
SOLD         -0.79
arters       -0.71
hips         -0.71
financing    -0.67
Fig          -0.65
fitted       -0.65
hops         -0.63
Opportun     -0.62
shoots       -0.62
itent        -0.61
Positive Logits
Token           Logit
Dictionary      4.02
dictionary      3.73
diction         2.64
ictionary       2.55
Encyclopedia    1.91
encyclopedia    1.67
vocabulary      1.59
lex             1.54
cyclopedia      1.54
dict            1.36
Activations Density: 0.023%