INDEX
Explanations
dates or time periods
phrases related to deadlines or timeframes
Negative Logits
Reviewer    -0.81
WATCHED     -0.64
wrong       -0.64
alogy       -0.63
oka         -0.62
tical       -0.62
ettel       -0.60
louder      -0.60
RGB         -0.60
âĺ          -0.60
POSITIVE LOGITS
hostilities  0.97
Ramadan      0.88
2020         0.85
2019         0.84
September    0.81
March        0.79
semester     0.77
August       0.77
December     0.76
June         0.76
Activations Density 0.073%