INDEX
Explanations
phrases indicating the sharing or provision of information
instances of providing information or updates
Negative Logits
Token          Logit
enment         -0.68
interfered     -0.68
affair         -0.67
gain           -0.66
destruct       -0.66
roying         -0.62
land           -0.61
bec            -0.61
shore          -0.60
âĻ¥            -0.60

Positive Logits
Token          Logit
details         1.10
detailed        1.09
insight         1.02
specifics       0.99
assurances      0.99
clarification   0.96
examples        0.95
testimony       0.93
insights        0.93
suggestions     0.92
Activations Density 0.105%