INDEX
Explanations
phrases related to evaluation or summary statements
statements of opinion and summaries of overall impressions
Negative Logits
staking       -0.74
ufact         -0.67
rack          -0.66
numbered      -0.66
legged        -0.65
rils          -0.65
previously    -0.64
osponsors     -0.63
predecessors  -0.63
bor           -0.62
Positive Logits
concludes     0.74
iem           0.72
boils         0.69
merce         0.67
ibaba         0.66
resil         0.66
takeaway      0.64
usefulness    0.63
rust          0.63
warr          0.63
Activations Density: 0.206%