Explanations
phrases related to events, controversies, and public figures
Negative Logits
Token         Logit
hovah         -0.73
)</           -0.70
depends       -0.65
cannot        -0.63
doesnt        -0.62
bis           -0.61
imum          -0.60
ceases        -0.60
arta          -0.59
determines    -0.59
Positive Logits
Token         Logit
yesterday     0.58
Wednesday     0.57
cluded        0.56
received      0.56
Went          0.56
rattled       0.56
Thursday      0.56
Tuesday       0.55
initially     0.55
memorable     0.54
Activations Density: 2.106%