INDEX
Explanations
mention of flat things or the word "flat" with varying intensity
references to flatness or the term "flat" in various contexts
Negative Logits
IVERS       -0.77
utra        -0.75
warr        -0.74
ANGEL       -0.74
appropri    -0.72
GBT         -0.71
EMENT       -0.70
ymes        -0.68
EStream     -0.68
REDACTED    -0.62
Positive Logits
ulence      1.47
ulent       1.37
tered       1.36
iron        1.20
bread       1.16
bed         1.14
lining      1.12
bush        1.09
ters        1.07
tering      1.02
Activations Density 0.036%