INDEX
Explanations
phrases mentioning dates or time periods
references to significant events or trends impacting people
Negative Logits
risked        -0.79
][            -0.65
¶             -0.60
¶             -0.60
infiltrated   -0.57
ÂŃ            -0.56
Holocaust     -0.55
Nazis         -0.55
Nazi          -0.54
])            -0.54
Positive Logits
OIL           0.68
oother        0.67
erenn         0.64
emi           0.64
assic         0.62
enium         0.59
2019          0.59
rematch       0.59
bye           0.59
ORY           0.58
Activations Density 1.560%