Explanations
the word "a" followed by an adjective
the article "a" in various contexts
Negative Logits
develops    -0.73
otte        -0.69
flies       -0.65
ocity       -0.63
isms        -0.63
python      -0.62
chuk        -0.61
Init        -0.61
writes      -0.60
otiation    -0.60
Positive Logits
convenient  0.82
testament   0.80
tricky      0.79
pity        0.79
reminder    0.79
bit         0.79
lot         0.78
handy       0.76
roud        0.76
huge        0.76
Activations Density 0.081%