INDEX
Explanations
conditional clauses, particularly those starting with 'if' and 'then'
conditional phrases involving sequenced actions or events
Negative Logits
yet          -0.67
worth        -0.61
making       -0.59
framework    -0.57
language     -0.56
still        -0.55
dylib        -0.55
gorilla      -0.55
attire       -0.54
bru          -0.54
Positive Logits
recons       0.87
éŃĶ          0.68
assador      0.68
proceeded    0.68
ramids       0.67
ividual      0.67
Äį           0.66
Ï            0.66
eln          0.66
guiActiveUn  0.65
Activations Density 0.264%