Explanations
conversational prompts for help
Negative Logits
féri        0.40
keeper      0.39
simulator   0.38
frutos      0.37
ferry       0.36
frequency   0.35
follower    0.35
\\          0.35
validity    0.35
addicts     0.34
Positive Logits
Ask         0.63
Homework    0.59
Expert      0.56
Question    0.55
Interact    0.55
Ask         0.54
Submit      0.53
Asking      0.52
Collabor    0.52
Connect     0.52
Activations Density: 0.000%