Explanations
phrases indicating comparison or similarity
phrases indicating consistency or similarity across statements
Negative Logits
Token       Logit
ighters     -0.75
æĪ¦         -0.67
THEN        -0.65
orthy       -0.65
charcoal    -0.64
leased      -0.63
ffff        -0.63
whe         -0.60
INCLUD      -0.59
bang        -0.59
Positive Logits
Token       Logit
pmwiki      0.87
everywhere  0.70
rences      0.69
Shape       0.65
Qué         0.64
guiActive   0.63
lihood      0.63
practise    0.62
prev        0.62
precedent   0.62
Activations Density: 0.435%