Explanations
sentences with question marks indicating a query or uncertainty
Negative Logits
places     -0.68
chairs     -0.67
ipel       -0.65
runners    -0.65
gra        -0.64
ilan       -0.64
chat       -0.61
forts      -0.60
agra       -0.59
mosqu      -0.59
Positive Logits
Thou         0.87
thou         0.82
anybody      0.78
nobody       0.73
anyone       0.72
there        0.70
)),          0.67
fortunately  0.66
YOU          0.66
it           0.65
Activations Density: 0.052%