Explanations
greetings and introductions in text
greetings or introductory phrases
Negative Logits
Token          Logit
staking        -0.75
excess         -0.75
ificantly      -0.73
deterior       -0.72
divest         -0.70
dwind          -0.69
destruction    -0.67
clauses        -0.66
neglect        -0.65
fail           -0.64
Positive Logits
Token          Logit
welcome        0.83
hello          0.83
Login          0.80
Hello          0.78
Nir            0.77
Welcome        0.77
Fellow         0.77
Welcome        0.75
Morning        0.75
traveller      0.75
Activations Density: 0.054%