Explanations
phrases related to situations or outcomes conflicting with expectations
repeated use of the word "despite."
Negative Logits
aird       -0.69
isen       -0.67
lees       -0.65
ecycle     -0.65
ISE        -0.63
isition    -0.62
isa        -0.61
aim        -0.61
ahime      -0.61
alky       -0.61
Positive Logits
math             0.82
acknowledging    0.79
having           0.72
knowing          0.68
lacking          0.66
seeming          0.66
conced           0.65
ĸļ               0.64
pite             0.64
surviving        0.64
Activations Density: 0.016%