INDEX
Explanations
References to time-based phrases or temporal expressions
Negative Logits
éłĨ      -0.15
tabs     -0.14
Genres   -0.14
geb      -0.13
pee      -0.13
edd      -0.13
عبار     -0.13
'",      -0.13
ãĥ¨      -0.13
Dana     -0.13
Positive Logits
twice    0.16
odus     0.15
nad      0.15
iad      0.14
itaire   0.14
ãĥ¼ãĥŀ   0.14
uth      0.13
alion    0.13
Fon      0.13
nist     0.13
Activations Density: 0.852%