Explanations

regarding or concerning

The feature detects tokens that mark or begin the assistant's responses, i.e., words frequently used at the start of assistant utterances.

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each
Dataset (Dashboard): lmsys + oasst1

Negative Logits

"把它"  0.38
"ಣಿ"  0.34
" intuitive"  0.33
"quint"  0.33
" 这是"  0.33
"ന്"  0.32
" ভেবে"  0.32
"𝖑"  0.32
" ultimate"  0.32
" bằng"  0.31

Positive Logits

"Regarding"  1.94
" Regarding"  1.91
" regarding"  1.88
"regarding"  1.68
"Concerning"  1.63
"至于"  1.61
" Concerning"  1.59
" concernant"  1.49
" щодо"  1.44
"至於"  1.44

Activations Density: 0.010%
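
To give a rough sense of scale for the density figure, the sketch below estimates how many token positions would activate the feature. It assumes the density is the fraction of token positions with nonzero activation and that it was measured over the dashboard prompts listed under Configuration. Neither assumption is confirmed by this page, and the function name `estimate_active_tokens` is hypothetical.

```python
# Figures copied from this page (Configuration and Activations Density).
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.010


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total token positions, estimated activating positions).

    Assumption: density is the share of token positions where the
    feature's activation is nonzero, measured over these same prompts.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY_PERCENT
)
print(f"total token positions: {total_tokens:,}")
print(f"estimated activating positions: {active_tokens:,.0f}")
```

Under these assumptions the feature fires on roughly one token position in ten thousand, which fits a narrow feature keyed to words such as "Regarding" and "Concerning".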