INDEX
Explanations
text describing actions or activities related to physical objects or locations
punctuation marks or delimiters in the text
Negative Logits
coni        -0.60
æ©          -0.56
suspic      -0.56
é¾          -0.54
explan      -0.54
Reward      -0.54
frontline   -0.54
proble      -0.53
avorite     -0.53
aky         -0.52
Positive Logits
huh         0.85
etc         0.83
please      0.69
please      0.66
partName    0.64
govtrack    0.64
which       0.60
plus        0.60
rete        0.59
eh          0.59
Activations Density 0.191%