INDEX
Explanations
time-related information like durations or periods
Negative Logits
Token      Logit
udi        -0.70
avid       -0.68
ociate     -0.67
umbnail    -0.65
oway       -0.62
isode      -0.62
otos       -0.62
rongh      -0.61
avi        -0.60
asc        -0.60
Positive Logits
Token       Logit
raining     1.36
unclear     1.05
impossible  1.04
easier      1.00
possible    0.98
uphill      0.96
downhill    0.94
advisable   0.91
easy        0.91
ironic      0.89
Activations Density: 0.472%