INDEX
Explanations
descriptions or elaborations within a text
Negative Logits
Token               Logit
lasted              -0.87
0000000000000000    -0.78
succeeded           -0.77
soever              -0.77
rocked              -0.74
behaved             -0.72
harmed              -0.70
bandwagon           -0.69
accountable         -0.69
hindered            -0.68
Positive Logits
Token               Logit
effic               1.06
accordance          1.04
detail              1.03
conjunction         1.02
efficiency          1.01
humane              1.01
lieu                0.92
terms               0.91
advance             0.91
escap               0.90
Activations Density: 0.140%