INDEX
Explanations
words starting with "sw" or similar patterns
occurrences of the term "news."
Negative Logits
PLAN          -0.70
chell         -0.67
caution       -0.67
cloves        -0.64
Centauri      -0.63
flares        -0.63
crush         -0.63
secretive     -0.62
restraining   -0.62
partial       -0.61
Positive Logits
ifty          1.09
athed         1.07
ift           1.05
addle         1.04
imming        1.04
ollen         1.02
atches        1.01
itched        1.00
arf           0.97
arer          0.95
Activations Density 0.008%