Explanations
phrases indicating significant moments or turning points in various contexts
references to significant moments or events in various contexts
Negative Logits
killed        -0.69
ŀ             -0.64
ubs           -0.63
exceptions    -0.63
leans         -0.62
inconsist     -0.60
LOG           -0.60
exception     -0.60
ratch         -0.59
waived        -0.58
Positive Logits
terms         1.09
efficiency    1.06
relation      1.02
regards       1.01
effic         0.99
favor         0.94
favour        0.90
escap         0.90
history       0.89
ordinate      0.87
Activations Density: 0.186%