INDEX
Explanations
statements about challenges or difficulties
Negative Logits
rongh      -0.77
ship       -0.66
Bounty     -0.59
ournal     -0.59
Fac        -0.58
mage       -0.58
livest     -0.58
ategory    -0.57
depend     -0.56
insula     -0.55
Positive Logits
enough          1.01
coincidence     0.78
entimes         0.74
consolation     0.71
underest        0.70
exaggeration    0.67
imagining       0.66
coded           0.66
uphill          0.66
irony           0.65
Activations Density 0.150%