INDEX
Explanations
numbers or symbols that are not typical in natural language text
sections of text that include high-frequency or common phrases and possibly denote general statements
Negative Logits
tack         -0.83
estab        -0.75
exha         -0.73
interven     -0.73
prec         -0.72
antit        -0.71
tram         -0.71
sculpt       -0.71
stabil       -0.69
overpower    -0.69
Positive Logits
When         1.40
Sometimes    1.40
Yesterday    1.35
Whether      1.34
Today        1.33
If           1.33
Unless       1.31
Whenever     1.30
Following    1.30
There        1.29
Activations Density 0.208%