INDEX
Explanations
language indicating focus on particular topics or questions
phrases introducing or referencing significant questions or topics
Negative Logits
ahime    -0.70
aq       -0.70
ugu      -0.69
iola     -0.62
hent     -0.61
pling    -0.60
say      -0.59
aga      -0.59
usk      -0.58
onday    -0.57
Positive Logits
occurs    1.07
deserves  1.05
arises    1.04
begs      1.02
haun      1.01
ought     1.01
awaits    1.00
arose     1.00
happens   0.99
hasn      0.97
Activations Density 0.166%
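
For readers unfamiliar with the density figure, the following is a minimal sketch of one common reading: the fraction of token positions on which the feature's activation is above zero. This definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions for illustration; the page itself does not state how the value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: the dashboard does not say how its density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, with the feature firing on 2 of them.
sample = [0.0] * 998 + [3.2, 0.7]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.166% would mean the feature fires on roughly 1 in every 600 tokens.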