INDEX
Explanations
phrases indicating emphasis or importance, often related to comparisons
phrases emphasizing the degree or extent of something
Negative Logits
STA          -0.72
ounded       -0.69
ersive       -0.67
aleb         -0.66
Compliance   -0.64
iets         -0.64
eele         -0.63
hua          -0.62
É            -0.61
Topics       -0.61
Positive Logits
imaginable   1.26
except       0.96
except       0.96
conceivable  0.91
nodd         0.84
nut          0.77
thereafter   0.73
sane         0.72
whatsoever   0.72
Except       0.70
Activations Density: 0.231%