INDEX
Explanations
phrases indicating a change or transition in time
present tense verbs and phrases indicating ongoing actions or states
Negative Logits
Patel        -0.70
Pengu        -0.69
amba         -0.68
Carib        -0.62
similarity   -0.60
Balloon      -0.59
Pes          -0.59
beforehand   -0.58
Huss         -0.57
spont        -0.57
Positive Logits
here         0.76
hops         0.73
EEK          0.69
adays        0.68
aukee        0.68
YA           0.67
enium        0.66
WIND         0.66
Ô            0.65
YP           0.63
Activations Density: 0.382%