INDEX
Explanations
help and advice
Phrases that signal formal, structured exposition (disclaimers, summaries, signposting, and instructional framing) within a response.
Negative Logits
AIRMAN    0.54
สต        0.46
الأخ      0.45
الاخ      0.45
BIN       0.42
ف         0.41
KU        0.40
INumber   0.40
ضيف       0.39
CORPER    0.39
Positive Logits
skilled   0.48
Village   0.44
食品      0.44
Detect    0.43
gauche    0.43
Crest     0.43
Skilled   0.43
Caball    0.43
Inspect   0.43
Vig       0.42
Activations Density: 0.022%