INDEX
Explanations
phrases related to announcements or predictions
phrases indicating predictions or expectations about future events
Negative Logits
ahime    -0.77
IG       -0.77
lamm     -0.77
lik      -0.77
te       -0.72
jing     -0.72
Deng     -0.71
empl     -0.70
ii       -0.69
ieg      -0.68
Positive Logits
Per      2.34
Per      2.29
Perry    1.93
per      1.77
PER      1.76
Perkins  1.75
PER      1.59
Perl     1.51
Peru     1.38
Percy    1.38
Activations Density 0.158%