INDEX
Explanations
sentences ending with the word 'then'
conditional statements or phrases that imply causation or consequences
Negative Logits
toe       -0.69
brush     -0.67
assed     -0.66
ety       -0.65
chio      -0.64
ting      -0.64
ction     -0.60
ラン      -0.58
ammy      -0.57
borgh     -0.57
Positive Logits
sshd       0.84
anski      0.78
Tens       0.76
Ͻ          0.71
orthy      0.71
etheless   0.67
guiActive  0.67
Inqu       0.66
forth      0.65
surely     0.64
Activations Density 0.029%