INDEX
Explanations
numeric counts or instances
instances of the word "count" in various contexts
Negative Logits
por         -0.70
perty       -0.69
etheus      -0.65
TPP         -0.62
impulse     -0.61
agar        -0.61
Flavoring   -0.60
obser       -0.59
Jinn        -0.59
wegian      -0.58
Positive Logits
enance      1.89
downs       1.04
esses       0.84
ENCY        0.84
ess         0.83
ries        0.75
count       0.74
books       0.73
rooms       0.73
offs        0.72
Activations Density 0.011%