INDEX
Explanations
phrases that emphasize a sense of exclusivity or limitation
the repeated use of the word "only."
Negative Logits
idon        -0.76
insula      -0.75
Nev         -0.71
finder      -0.65
multipl     -0.62
arted       -0.61
raught      -0.59
Everest     -0.58
cent        -0.58
ordering    -0.57
Positive Logits
marginally      1.02
incidentally    0.78
lasted          0.75
scratches       0.74
scratched       0.73
sparing         0.72
rarely          0.72
limited         0.71
kidding         0.70
isons           0.69
Activations Density 0.066%