INDEX
Explanations
references to time and temporal phrases
Negative Logits
Token      Logit
ermann     -0.16
owi        -0.16
orta       -0.15
uario      -0.15
ourd       -0.15
Pell       -0.15
ud         -0.15
erm        -0.14
vaguely    -0.14
230        -0.14
Positive Logits
Token      Logit
iegel      0.16
unden      0.16
rooft      0.15
stocks     0.15
ãĤ¤ãĥĪ     0.15
ÎijÎł       0.15
tones      0.15
stock      0.14
otland     0.14
omba       0.14
Activations Density: 0.095%