INDEX
Explanations
phrases indicating an approximation or range
phrases that express uncertainty or variability in a statement
Negative Logits
corridors    -0.65
Cy           -0.61
EDITION      -0.58
Puzz         -0.58
Vector       -0.56
bourg        -0.54
juggling     -0.53
ynski        -0.52
Mane         -0.52
AIR          -0.52
Positive Logits
nery         1.12
leans        1.12
chard        1.11
lando        1.08
nam          0.97
acle         0.94
chid         0.94
gin          0.92
phan         0.91
acular       0.90
Activations Density: 0.035%