INDEX
Explanations
questions, particularly those that seek clarification or information
Negative Logits
Token     Logit
oire      -0.76
fread     -0.72
aus       -0.72
böz       -0.71
navbar    -0.71
Bradley   -0.70
Inters    -0.68
𝓭         -0.68
MAT       -0.67
Mait      -0.67
Positive Logits
Token     Logit
%?        1.73
?         1.57
?!?       1.56
؟         1.43
!?        1.41
?.        1.39
$?        1.39
’?        1.37
?!        1.37
?}        1.32
Activations Density: 0.136%