Explanations
phrases indicating possibility or necessity
phrases suggesting uncertainty or speculation
Negative Logits
ĸļ              -0.87
llah            -0.65
always          -0.60
relentlessly    -0.60
è£ıç            -0.60
tirelessly      -0.60
everything      -0.59
every           -0.59
peror           -0.57
ulner           -0.56
Positive Logits
someday          1.40
depending        0.94
momentarily      0.92
slightly         0.91
somew            0.87
some             0.87
unintentionally  0.87
slight           0.87
temporarily      0.87
inadvertently    0.86
Activations Density 0.605%